Nature at 150: evidence in pursuit of truth


A century and a half has seen
momentous changes in science. 
But evidence and transparency are 
more important than ever before. 


On 4 November 1869, the first issue of Nature
made its way into the world. Its ambition was
intellectually bold and commercially risky: 
to bring news of the latest discoveries and 
inventions to scientists and the public alike. 

Although the journal was aimed at a broad audience, 
scientists took a particular liking to it — because it allowed 
them to communicate their findings quickly. Nature’s 
weekly schedule offered a refreshing contrast to the 
leisurely timescales of learned-society journal publishing 
and conference proceedings. And, as universities grew, 
more ‘letters to the editor’ from scientists started arriving 
at the Nature offices in London. The journal became a venue
for publishing discoveries because its writers also became 
its readers — and we have been trying to serve scientists 
and society ever since. 

In this, Nature’s 150th anniversary issue, we're celebrating
and remembering many of the notable discoveries that
authors have communicated in the journal’s pages, along 
with the agenda-setting journalism and commentary that 
has always been an essential part of our voice. 

A century and a half is long enough to see how our 
understanding of the natural world changes with each 
instalment of new evidence. Take human origins. In Feb- 
ruary 1925, Nature published the discovery by Raymond 
Dart of Australopithecus africanus in South Africa¹. It was
the first fossil link between humans and apes, and it caused 
a sensation, providing evidence that humans evolved from 
a common ancestor in Africa as Charles Darwin had proposed
— and not in Britain or Indonesia as had previously
been thought. 

Nearly 80 years later, the discovery of the remains of 
Homo floresiensis in 2004, which came to be known as 
the hobbit, demonstrated that our genus was remarkably 
diverse². Further revelations about human prehistory and
evolution quickly followed, culminating in advances in 
ancient genomics. These have revealed that, as recently 
as 30,000 to 60,000 years ago, humans coexisted and 
had offspring with other hominins — Neanderthals and 
Denisovans³.

Nature also published some of the remarkable 
developments that took place in physics in the early part 
of the twentieth century. These include James Chadwick’s 
proposal in 1932 of the existence of a new particle, the
neutron, to add to the electron and the proton⁴. Today,
many more fundamental particles have been discovered
because of the predictions of the standard model of particle
physics. Some of the earliest findings of exoplanets
appeared in our pages, including, in 1995, the first report
of an exoplanet orbiting a Sun-like star⁵ — for which
Michel Mayor and Didier Queloz won
a share of the 2019 Nobel Prize in Physics.

Nature made its debut on 4 November 1869.

Arguably, Nature’s most memorable publications were 
the reports in April 1953 on the structure of DNA — including
papers from Maurice Wilkins⁶ and Rosalind Franklin⁷, in
addition to the paper by Francis Crick and James Watson⁸.
The discovery that DNA was a double helix changed biology
forever. Nearly 50 years later, we proudly published the
first draft sequence of a human genome carried out by a
publicly funded group, the International Human Genome
Sequencing Consortium⁹. Without the researchers’ collective
achievement, medicine, agriculture, conservation and
criminal justice would look very different today. 

There is, of course, no definitive list of the most 
influential or important pieces of research that Nature 
has published. A series of News & Views articles on 
page 35 explains the importance and lasting impact of 
ten key papers from our archive. We also chose a long list 
of 150 interesting, illuminating, entertaining and some- 
times controversial articles — one for every year of our 
life — and have been posting one per day on social media 
for the past few months. But even compiling this longer 
list involved vigorous and sometimes tense discussion 
among the editors. 

At the start of the year, we also began discussing what to
feature on our anniversary issue cover. The result — a data
analysis of Nature’s archive which highlights the multidisciplinary
scope of the journal — is rendered as the extraordinary
fireworks you can see on the cover and in a video
and interactive visualization online. Our anniversary issue
includes a rich variety of written and multimedia content
on the past, present and future of Nature and of research 
itself. 


Responsible science 

As science has advanced during the past century and a
half, discovery has gone hand in hand with world-changing 
inventions — particularly in industrial-scale technologies. 
Many of these technologies, from the internal combustion 
engine to synthetic agrochemicals, have improved the 
quality of life for hundreds of millions of people; but at 


Nature | Vol 575 | 7 November 2019 | 7 


© 2019 Springer Nature Limited. All rights reserved. 


Editorials 




the same time they have also damaged the environment 
or raised serious ethical and safety concerns. 

In some cases, researchers have been able to sound the
alarm in time for remedial action, as chemists Mario Molina
and Sherwood Rowland did in June 1974 when they worked
out that chlorine originating from chlorofluorocarbons
(CFCs) was destroying atmospheric ozone¹⁰. A decade later,
physicist Joe Farman and colleagues showed that ozone
levels over Antarctica were lower than expected — the first
detection of the ozone hole¹¹.

These findings led to the 1987 Montreal Protocol, an
international agreement to cut ozone-depleting substances,
and a shining example of how people can unite
to take action when scientific evidence points to an 
impending environmental disaster. Sadly, the same cannot 
yet be said for climate change, even though researchers 
have been sounding ever-louder warnings since the 1970s 
that greenhouse-gas emissions are warming the planet. 

As the pace of discovery and invention accelerates — 
from isolating stem cells¹² to the development of cloning¹³
and gene-editing technologies, to last month’s description
of quantum supremacy¹⁴ — there is a clear need, perhaps
now more than ever, for researchers and research publish- 
ers to acknowledge, and implement, our responsibility to 
society. We must commit to greater openness and ensure 
that findings are reproducible, and we must act with integrity
at all times. Nature and the researchers it serves have a
duty to work side by side with those in our broader society 
who will be affected by the products of research, and to 
consider generations to come. 


Room to improve 


Looking back, there have been times when Nature did 
not adhere to standards that we hold ourselves to today. 
We should have called out when Jocelyn Bell Burnell was 
overlooked for the Nobel physics prize for her work in the
discovery of pulsars¹⁵. And it shouldn’t have taken until
2007 for us to replace the phrase “scientific men” with 
“scientists” in our mission statement. 

Organized peer review — the cornerstone of scientific 
publishing — was introduced in Nature only after 1966, 
although we have tried to make up for lost time since. In 
2006, Nature conducted trials on open peer review; we 
now offer authors double-blind peer review, and are one 
of several journals in the Nature family to offer reviewers 
the opportunity to be named. 

Another area where overdue change is under way is in 
the people represented in the journal. In the early years, 
Nature was dominated by papers with one or two authors, 
mostly male, and mostly from the Northern Hemisphere. 
Today, papers with a single author are almost unheard 
of and author lists can run to the thousands, reflecting 
the increasingly team-based nature of current research. 
Although most of our authors still come from institutions 
in Europe and North America — where most research fund- 
ing is concentrated — our author community is becoming 
more geographically diverse. 

But researchers from large parts of the world, notably 
Africa, remain under-represented. This reflects broader 
inequalities that stem from the uncomfortable historical
reality that science and empire often worked in a symbiotic
relationship. We recognize that Nature was founded at the 
height of such an age. Change will need time, but we are 
committed to doing more to make a difference. 


View to the future 


As the boundaries between disciplines blur and research 
becomes increasingly multi- and transdisciplinary, 
Nature is moving beyond a traditional focus on the natu- 
ral sciences to embrace social sciences, translational and 
clinical research and applied science and engineering. 
Looking to the future, we hope to contribute to greater 
transparency and openness in academia. We will probably 
see even more collaborative ways of doing research and 
more changes in the way it is published. 

Predicting the future is notoriously difficult. Writer Wil- 
liam Gibson, in his 1984 cyberpunk novel Neuromancer, 
foresaw a form of today’s stem-cell therapy and sophisti- 
cated artificial intelligence, but failed to anticipate mobile 
phones. Even in the early 1990s, relatively few people anticipated
that ‘electronic publishing’, as it was starting to be
called, would jeopardize the future of mass-produced 
printed journals. The most exciting and dramatic changes 
will be the ones we cannot imagine today. 

It’s unlikely that our founders imagined that, 150 years 
on, Nature would be publishing more than 850 research 
papers and 3,000 articles of news, opinion and analysis 
each year, and reaching around 4 million readers online 
each month. That’s your doing: researchers and their 
remarkable discoveries have made us what we are. We have 
reached this important milestone only through listening, 
responding and adapting to the community we serve. 

In other respects, Nature now is just the same as it was 
at the start. We will continue in our mission to stand up 
for research, serve the global research community and 
communicate the results of science around the world. We 
will strive to hold to account those in positions of respon- 
sibility in research, policy and industry, and to continue 
to advocate for fewer unintended harmful consequences 
of research for people and the planet. 

Research, science, knowledge, scholarship — how- 
ever we might choose to characterize the marshalling of 
evidence in the pursuit of truth — the values we hold are 
more important than ever before. 
VERONICA ADROVER


A personal take on science and society 


World view 


By Paul Smaldino 


Better methods can’t make 
up for mediocre theory 


With better questions, many reproducibility 
problems will fall away, says Paul Smaldino. 


Much digital ink has been spilt describing ways
to improve replicability in science. Preregistration.
Open data. Open code. These are
all necessary, but insufficient. The thing is,
we don’t just want science to be reproducible.
We want it to help us to make better sense of the world.

For that, we must create better hypotheses — and those 
require better models and better measurements. 

A theoretical model of mine (P. E. Smaldino and R.
McElreath R. Soc. Open Sci. 3, 160384; 2016) made headlines
when it showed that bad science — or rather, less rigorous
science that could produce more papers in less time — could
crowd out the more robust sort. This suggested that generating
better hypotheses is at least as important as reducing
methodological errors for minimizing false discoveries. 

Who cares if you can replicate an experiment that found 
that people think the room is hotter after reading a story 
about nice people? Will this help us to develop better 
theories? You can craft a fun story about that result, but 
can you devise the next great scientific question? 

To generate good hypotheses, we need good theory. In 
a landmark study attempting to replicate 100 psychology 
papers, cognitive-psychology studies were replicated about 
twice as often as those from social psychology (Open Science 
Collaboration. Science 349, aac4716; 2015). I think that’s 
because cognitive psychology has better theories. 

Good theory has at least two requirements. First, it can be
used to build mathematical or computational models that 
derive clear, testable consequences from our assumptions. 
Every mature scientific discipline has these. Physicists use 
models of force and momentum to predict the motion of 
materials. Epidemiologists use models of contagions to 
understand the spread of disease. Neuroscientists use models
of neural-spike trains to understand information flow in
the brain. Social scientists use game models to understand 
the emergence of social norms. 

Second, good theory must make sense, or at least 
acknowledge its contradictions. Consider the ‘pre-cogni- 
tion’ studies of US social psychologist Daryl Bem, which 
were completed with remarkable transparency (D. J. Bem
J. Pers. Soc. Psychol. 100, 407-425; 2011). (The general consensus
is that these studies did not establish the presence of
extrasensory perception in college students, but the preva- 
lence of overly flexible statistics; Bem defends the statistics 
as sound.) The work flouted well-supported ideas about 
physics and causality. It was akin to when physicists at CERN, 
Europe’s particle-physics laboratory near Geneva, Switzer- 
land, ‘discovered’ faster-than-light neutrinos, violating the 
special theory of relativity. Because the researchers required 
their results to be consistent with a broad theoretical framework,
they probed deeper and discovered that their finding
stemmed from a loose fibre-optic cable. To be clear, it’s not
the case that surprising claims are always wrong — but such 
claims must undergo extensive scrutiny. 

If useful models produce better science, then what drives
better models? Improved measurements. Consider the work 
of Tycho Brahe — a great astronomer of the sixteenth cen- 
tury, who nonetheless thought that the Sun orbited Earth. 
Yet his painstaking measurements of the positions of the 
planets allowed Johannes Kepler to determine that their 
orbits are elliptical. From this, Isaac Newton could formalize 
his theory of universal gravitation, which allowed modern 
researchers to ask countless questions about planetary 
motion, cosmology, ballistics, engineering and more. 

If we can’t reliably measure something, it’s hard to build 
a theory about it. Quantities such as position, mass and 
time are relatively easy to measure, at least at some scales. 
Cognitive scientists can readily measure skin conductance, 
reaction times and word counts; this allows regularities and 
variation to be observed, and thus the construction of test- 
able models. Other fields, including those I work in, have 
struggled with measurements. Psychologists attempt to 
measure emotions, identities and beliefs. Social scientists 
attempt to measure inequality, polarization and disinforma- 
tion. Biomedical scientists attempt to measure treatment 
outcomes in small, heterogeneous populations. 

I think that many sciences struggling with replication 
are those with the most pressing challenges in taking clear 
measurements. The trick lies not in merely finding a meas- 
urement that can be made precisely or described trans- 
parently, although these factors are important. Instead, 
scientists must find properties that can be reliably meas- 
ured, inform theory and lend themselves to quantification 
in formal models. 

Ideally, strong theories, formal models and measure- 
ments will interact in a virtuous cycle. Models allow us 
to study assumptions about the world and discover their 
consequences. The results can show what measurements are 
needed to test the assumptions, and those measurements 
can provide empirical patterns that invite explanations, 
which models can provide. And on and on.

We absolutely need better methods for hypothesis 
testing, and these are already being incorporated into how 
scientists are trained and how science is done. 

So now it is time to focus on better practices for 
hypothesis generation. We need training programmes in 
model building and critique, plus consortia-building and 
funding programmes to invent and test measurements that
make models tractable. 

Better methods will help us get the right answers; models 
and measurements will ensure we ask the right questions. 
The world this week


News in brief


CLIMATE FUND ATTRACTS
RECORD SUM FOR
DEVELOPING NATIONS


Developed nations have 
together pledged US$9.8 billion 
to replenish a United Nations 
fund that helps low-income 
countries to reduce their carbon 
emissions and adapt to the 
impacts of climate change. 

At a conference on 24-25
October in Paris, 27 countries
promised to contribute to
the latest fund-raising round
for the Green Climate Fund
(GCF). The total value of these
pledges exceeds the $9.3 billion
promised in the last round in
2014, despite the absence this 
time of the United States and 
Australia. Thirteen nations, 
including the United Kingdom, 
Germany and France, pledged 
at least double what they did 
five years ago, in domestic- 
currency terms. 

The fund was established in 
2010 and has so far allocated 
$5.2 billion to climate-change 
mitigation and adaptation 


projects around the world. 

The United States committed 
more money to the GCF than 
any other nation in 2014, but 
US President Donald Trump has 
since withdrawn $2 billion of the 
$3 billion that was promised, 
and has declined to contribute 
further to the fund. This left a 
substantial hole in the GCF’s 
coffers, although European 
nations have largely made up 
the shortfall. 

The fund remains open, and 
it is likely that more countries
will make pledges in the coming 
months. Countries that have 
been stymied by domestic 
political processes could also 
increase the amount they have 
said they will give. More funding 
is expected from Belgium, for 
instance, where a parliamentary 
resolution to double its 
$45-million contribution came 
too late to be reflected in its 
most recent pledge. 


CLIMATE CASH
In the latest fund-raising session, 27 countries pledged US$9.8 billion to the Green Climate Fund.
Legend: 2014 pledge; 2019 pledge. Category label: Other countries.


INTERSTELLAR 
COMET CONTAINS 
ALIEN WATER 


Astronomers have for the first 
time spotted signs of water in 
our Solar System that originated 
somewhere else. The alien
water seems to be spraying off
comet 2I/Borisov, which is flying
towards the Sun on a journey
from interstellar space, reported
a team led by Adam McKay at
NASA's Goddard Space Flight 
Center in Greenbelt, Maryland, 
on 28 October (A. McKay et al. Preprint at
https://arxiv.org/abs/1910.12785; 2019).

“There’s water — that’s 
cool, that’s great,” says Olivier 
Hainaut, an astronomer at the 
European Southern Observatory 
in Garching, Germany. Most 
comets contain a lot of water, 
he says — but confirming its 
presence in an interstellar comet 
is an important step towards
understanding how water might 
travel between the stars. 

Astronomers have been avidly 
tracking Borisov ever since its 
discovery on 30 August. It is only 
the second interstellar object 
ever spotted. 

McKay and his colleagues used 
a 3.5-metre telescope at Apache
Point Observatory in Sunspot, 
New Mexico, to probe the 
sunlight reflecting off Borisov. 
On 11 October, they spotted 
signs of oxygen in light coming 
from the comet. Although 
comets can produce oxygen 
in various ways, researchers 
say that the most likely source 
is water breaking apart into 


Major climate 
conference 
swaps venue 
amid protests 


The United Nations’ annual climate summit will decamp
to a new continent, as a result of massive protests
against economic inequality (pictured) that have 
rocked Chile for nearly two weeks. 

On 30 October, Chilean President Sebastián Piñera
cancelled plans to host the climate meeting, known
as the 25th annual Conference of the Parties (COP25),
which was due to start in December in Santiago, citing
safety concerns. A day later, Spanish Prime Minister Pedro
Sánchez offered to host the summit in Madrid, a
proposal the UN has accepted. The summit will still take 
place on 2-13 December. 

Countries attending COP25 plan to work out the 
details of implementing the Paris climate agreement 
ahead of 2020, when they will update their climate 
pledges under the international pact. As many as 
25,000 people are expected to attend the talks. 

The cancellation was the latest in a series of obstacles 
for the climate conference. Chile had agreed last year 
to host the talks after Brazil backed out of holding the 
meeting. 


MEASLES ERASES 
IMMUNE ‘MEMORY’ 
FOR OTHER DISEASES 


Measles infections in children 
can wipe out the immune 
system’s memory of other 
illnesses. This can leave children 
vulnerable to other pathogens 
that they might have been 
protected from before their 
bout of measles. 

The findings, published 
on 31 October in Science and 
Science Immunology, come 
at a time when measles cases
are spiking. Globally, there 
were more measles infections 
in the first six months of 2019 
than in any year since 2006, 
according to the World Health 
Organization. 

The measles virus seems 
to destroy immune cells that 
‘remember’ encounters with 
specific bacteria and viruses 
(V.N. Petrova et al. Sci. Immunol. 
4, eaay6125; 2019). And results 
from a separate team indicate 
that measles can damage plasma 
cells in the bone marrow, cells 
that could otherwise produce 
pathogen-specific antibodies 
for decades (M.J. Mina et al. 
Science 366, 599-606; 2019). 

The findings emphasize how 
the measles vaccine protects 
against more than just measles, 
says Velislava Petrova, an 
immunologist at the Wellcome 
Sanger Institute in Hinxton, UK, 
who led the Science Immunology 
study. 


SOUTH KOREA 
CLAMPS DOWN ON
‘WEAK’ CONFERENCES


South Korea’s education 
ministry wants to stop 
academics from participating 
in conferences that have little 
academic value. The ministry 
announced on 17 October that 
it will require all universities 
to adopt measures to vet 
academics’ travel to overseas 
conferences so as to “prevent 
researchers from engaging in 
poor academic activities”. 

The ministry’s order 
comes after a report that it 
released in May, which found 
that 574 professors from 
90 universities around the 
country had participated in 
conferences that it called 
“weak”. It is thought that some 
researchers knowingly elect 
to pay the fees to attend such 
conferences, or to publish in 
low-quality journals — some 
of which are considered 
‘predatory’ journals — because 
they are a quick and easy
way to add a publication or 
presentation to their CVs, or to 
gain experience in presenting at 
international conferences. 

Under the new policy, 
researchers will be required 
to fill out checklists before 
attending overseas conferences 
and then submit the lists to 
their universities, which will use 
them to screen the adequacy of 
the researchers’ academic and 
research activities, the ministry 
told Nature in a statement.


News in focus 


The Kincade Fire has burnt a swathe through Sonoma County, California, since it began on 23 October. 


CALIFORNIA SCIENTISTS 
RACE TO ASSESS HEALTH 
RISKS OF WILDFIRE SMOKE 


Bay Area study will track long-term effects of 
pollution on the heart, lungs and immune system.


By Amy Maxmen 


As the skies above the San Francisco Bay
Area in California filled with smoke in
late October from wildfires ripping
through nearby Sonoma County, Kari
Nadeau and Mary Prunicki sprang
into action.

The pair, scientists at Stanford University in 
the Bay Area, began calling in hundreds of peo- 
ple who had signed up to participate in their 
study of the long-term health effects of wild- 
fire smoke. Previous research has linked air 
pollution from wildfires to surges in hospital 
visits for asthma and strokes. But it’s not clear 
whether exposure to wildfire pollution creates
chronic health problems — something that
Nadeau, director of Stanford’s Sean N. Parker 
Centre for Allergy & Asthma Research, and Pru- 
nicki, a pollution biologist, hope to find out. 
In early October, before the first large wild- 
fires of the year sparked in northern California, 
their team assessed the circulatory, respira- 
tory and immune systems of people enrolled 
in the study. The scientists began calling 
participants back to their lab on 28 October 
to undergo the same tests, which they will 
repeat in three months after the smoke has 
cleared. Nadeau and Prunicki have approval 
to continue assessments until 2037, and ulti- 
mately hope to enrol as many as 2,000 people 
— amassing a trove of data on how a person’s
body responds to wildfire smoke over time.
Answers are sorely needed. Wildfires 
burned a record-breaking 760,000 hectares 
last year in California; almost 100 people died 
and hundreds of thousands of others breathed 
in sooty air for days. As Nature went to press, 
the massive Kincade Fire in northern California 
had burnt about 32,000 hectares, destroyed 
more than 370 structures and prompted evac- 
uations and power outages (see ‘Wildfires 
disrupt science’). And climate models predict 
that such blazes will grow larger and more fre- 
quent in the coming decades. The area burnt 
in California each year will increase by 77% by 
the end of the century if greenhouse-gas emissions
continue to rise, according to the state’s
most recent climate-change assessment.

The area burnt by wildfires in California is projected to rise as climate change intensifies.

Lisa Miller, an immunologist at the 
University of California, Davis, says the Stan- 
ford study is one of the first to monitor wild- 
fires’ health effects in a diverse group of people 
over several years. By understanding who is 
most affected by wildfire and why, Miller says 
researchers can create evidence-based guide- 
lines for mitigating risk. She is particularly 
worried that smoke exposure could damage 
children’s developing lungs in ways that lead 
to chronic health problems. 

“We have to be better prepared for these 
events,” she says. “Last year was everyone’s 
wake-up call that we need to be ready for the 
next big fire to happen.” 

The idea for the health study arose last year, 
as the largest and deadliest blaze in California’s 
history — the Camp Fire — ravaged the north- 
ern part of the state. After the fire destroyed 
the city of Paradise in California’s Central 
Valley and turned skies brown above the Bay 
Area, Nadeau and Prunicki realized their skills 
were needed. 

The pair has long studied how air pollu- 
tion in the central California city of Fresno 
alters immune cells and causes allergies and 
asthma. In April, they reported that 7- and
8-year-old children living about 100 kilometres
away from wildfires in 2015 were exposed to 
more pollutants than were those living near 
prescribed burns — small forest fires that 
are purposely set to reduce overall fire risk 
(M. Prunicki et al. Allergy 74, 1989-1991; 2019).

The researchers suspect that the difference 
is due to toxic chemicals released when wild- 
fires burn synthetic materials in houses and 
cars. “Wildfire is like a giant slug of air pollu- 
tion all at once,” Prunicki says. 

As the smoke from the Camp Fire hung 
over the Bay Area last year, she and Nadeau 
scrambled to launch a small study tracking the
effects of wildfire pollution on health. They
collected blood and saliva from about 100 people,
and asked them to return for assessments
a few months later. “We didn’t have time to collect
tons of information, and it was sort of done
in reverse,” says Nadeau.

In February, they submitted a proposal for a
larger study to an ethical review board. To fund
the work, they set aside about US$1 million
from a grant they had received from the Parker
Foundation in San Francisco. 

The team is conducting the wildfire research 
in the Bay Area because the air quality is typi- 
cally better than that of Fresno, says Nadeau. 
This should help her team to isolate the health 
effects of wildfire smoke from those caused by 
other environmental hazards. 

The scientists also began an 80-person 
study last week to test whether air purifiers 
can limit any health effects from exposure to 
wildfire smoke. Half of the students in a college
dormitory in Fresno have air-filter machines
installed in their rooms, and the other half have
a sham machine. The goal is to work out how
much air filters help, and who needs them. 

Michael Wara, an energy and climate policy 
analyst at Stanford, hopes to incorporate data 
from Nadeau and Prunicki’s health studies into 
models on the costs and benefits of various 
policies to curb wildfire damage. “Fire is a cli- 
mate-adaptation problem that California is 
confronting right now,” he says, “and not in 
2050, and not in 2100.”

The researchers behind the northern Cali- 
fornia health study hope that its findings will 
help people around the world who are exposed 
to wildfire smoke. “This isn’t just a problem 
for the [US] west,” Nadeau says. “We need to 
know how to adapt better. Right now, people 
are left unaware.” 
Wildfires
disrupt science 


Power cuts have added to 
uncertainty for researchers. 


The blazes that have torn through 
California since late October have 
prompted evacuations and power outages 
that have disrupted research. 

The University of California, Berkeley 
(UCB), and the neighbouring Lawrence 
Berkeley National Laboratory (LBNL) 
were among the institutions in northern 
California that closed on 26 October as a 
result of a planned blackout that followed 
the Kincade Fire, which broke out on 
23 October near Santa Rosa. 

This was the second outage in a month. 
The first, which occurred on 9-10 October 
when the Pacific Gas and Electric Company 
(PG&E) of San Francisco, California, cut 
power to reduce the risk of wildfires, 
caused the most chaos. One UCB lab 
moved freezers full of specimens to nearby 
facilities that still had power, while others 
stocked up on dry ice to keep their samples 
frozen. 

But researchers say that things went 
more smoothly during the second 
blackout. That time around, university 
officials pre-emptively switched to a 
campus power plant before PG&E cut 
electricity to the area on 26 October. 
Despite the campus closure, researchers 
were still able to access facilities to check 
on their samples and experiments, but they 
had to scramble to relocate meetings. 

A conference on the popular gene- 
editing technique CRISPR, scheduled for 
26 October, had to be moved off campus, 
says Jennifer Doudna, a biochemist at UCB. 
Organizers streamed the meeting online 
for those who couldn't squeeze into the 
smaller space. “How can we be living in a
state with the fifth-largest economy in the 
world and having power outages like this?” 
Doudna asks. 

She hopes the situation will push 
lawmakers and PG&E to bolster the grid to 
avoid such disruptions in the future. “I don’t
think this type of climate is going away,” 
says Doudna. “We have to plan for it.” 

UCB resumed normal operations 
on 29 October, and LBNL reopened on 
30 October. In Los Angeles, another blaze, 
the Getty fire, prompted the University of 
California, Los Angeles, to cancel classes 
for one day on 28 October. 


By Jeff Tollefson

PRIMATE EMBRYOS GROWN
IN THE LAB FOR LONGER
THAN EVER BEFORE


The 20-day-old monkey embryos could reopen the debate about 
how long the human variety should be allowed to grow in a dish.


Two groups in China have grown embryos from cynomolgus monkeys for 20 days. 


By David Cyranoski 


They are the longest-lived primate
embryos to thrive outside the body.
The monkey embryos survived in a
dish for 20 days, thanks to techniques
developed by two groups working in
China. The work sheds light on a crucial but 
little-understood phase of early development, 
and will probably reignite the debate about 
how long human embryos should be permitted 
to develop in the laboratory. 

Researchers grow embryos to understand 
the earliest stages of development. In 2016, 
biologists in the United States grew human 
embryos in the lab for 13 days, but then 
stopped the experiments because of an inter- 
nationally accepted rule not to allow growth 
beyond 14 days for ethical reasons. Because 
monkeys are a closely related species, their
embryos are a window into early human devel- 
opment, but scientists have previously grown 
them for only nine days. 

The two teams in China now report in 
Science (refs 1 and 2) that lab-grown embryos from
cynomolgus monkeys (Macaca fascicularis) 


underwent several crucial processes. In one 
of these, gastrulation, the basic cell types that 
give rise to different organs begin to emerge, 
at around day 14. 

“The best part is that there is a system 
to study gastrulation in vitro in a model 
very similar to the human,” says Magdalena 
Zernicka-Goetz, a developmental biologist 
at the California Institute of Technology in 
Pasadena. “This is very exciting.” 

Although the studies show that early monkey 
development mirrors many aspects of the first 
two weeks of the human process, the teams 
report subtle differences between the two 
species. This suggests that monkey embryos 
might not be an adequate model for studying 
some advanced stages of human development, 
says Pierre Savatier, a stem-cell biologist at the 
Stem-cell and Brain Research Institute in Bron, 
France. He predicts that the papers will reinvig- 
orate a push to extend the 14-day policy. 

The ability to grow monkey embryos for 
longer than ever before could also boost 
research in another hot and controversial field 
— the generation of hybrid human-monkey 
embryos, known as chimaeras, with the goal 


of investigating how human cells differentiate
into organs. This research has been held back 
because researchers haven't been able to grow 
monkey embryos for long enough to see how 
the injected human cells behave. Savatier says 
he will use the culture technique to grow mon- 
key embryos that will be injected with human 
stem cells. “This culture system is hugely 
important for chimaera experiments,” he says. 


Embryo bonanza 

Both teams grew monkey embryos on a gel
matrix that supplied higher levels of oxygen 
than do cells in the womb. This culture tech- 
nique was developed by Zernicka-Goetz’s 
team, which was one of two groups (refs 3 and 4) in the
United States that grew human embryos for 
13 days, in 2016. 

In one of the latest two papers, a team led 
by Juan Carlos Izpisua Belmonte, a devel- 
opmental biologist at the Salk Institute for 
Biological Studies in La Jolla, California, 
and Ji Weizhi at the Yunnan Key Laboratory 
of Primate Biomedical Research in Kun- 
ming, China, reports that 46 of 200 monkey 
embryos survived to 20 days. The authors of
the other paper, led by Li Lei, a developmental
biologist at the Institute of Zoology, Chinese 
Academy of Sciences, in Beijing, say they grew 
three embryos for that long. 

The teams tracked the progress of the 
embryos, which were created using in vitro fer- 
tilization, to check whether they grew as they 
would have in the womb. They examined the 
timing and shape of structures in the embryos
and the structures that support embryonic 
growth, the types of protein that are expressed 
by cells at different stages and the primordial 
germ cells that go on to become eggs or sperm.
Then they compared these observations with 
what is known about development of this spe- 
cies from past experiments, in which embryos 
were removed from pregnant monkeys at different
stages up to 17 days.

Both groups report that embryos in a dish
develop in the same way as those in the womb.
“It’s ok to assume that the observations made 
are a representation of what happens in vivo,”
says Izpisua Belmonte. 

The teams stopped their experiments on 
day 20, when the embryos turned dark and 
some cells detached — signs that the struc- 
tures were collapsing. Li says it’s not clear 
why that happened. He and Izpisua Belmonte 
say that culturing the cells in an extracellular 
matrix that better mimics the womb might 
help them to survive longer. Next, Ji hopes to 
grow embryos to the point when the primitive
nervous system starts to form, around day 20. 


Subtle differences


Savatier says one difference between monkey 
and human embryos, described in the Ji and 
Izpisua Belmonte paper, is that the genes that
are expressed in monkey cells that form the 
placenta are different from those in humans. 
But to study these processes in later stages in 
human embryos, regulators would need to lift 
the 14-day ban. 

After the US teams grew human embryos 
to 13 days, some scientists and ethicists 
pushed for a revision of the policy, and a poll 
conducted in the United Kingdom in 2017 
reported strong public support for extending 
the limit beyond 14 days. Savatier and others 
think the latest results showing the unique fea- 
tures of human embryonic development will 
strengthen arguments to change the policy. 

Researchers are optimistic that the gel 
matrix could be used to grow human embryos 
to a more advanced stage if the rules change.
Ji says that another group at his institute has 
developed a protocol specifically for human 
embryos that will soon be published. “This 
system could be suitable for human embryos 
to be cultured to 20 days,” he says, “but we are
not planning to do it.”


1. Niu, Y. et al. Science http://doi.org/ddn3 (2019).
2. Ma, H. et al. Science http://doi.org/ddn4 (2019).
3. Deglincerti, A. et al. Nature 533, 251-254 (2016).
4. Shahbazi, M. N. et al. Nature Cell Biol. 18, 700-708 (2016).

GENOMES TRACE ORIGINS
OF ENSLAVED PEOPLE 
WHO DIED ON ISLAND 


Former slaves left on St Helena were probably taken 
from west-central Africa, a genome study finds. 


By Ewen Callaway 


Genomes from enslaved Africans who
were freed and died on a remote 
Atlantic island in the mid-nineteenth 
century are offering clues about their 
origins in Africa. The findings come 
from the largest study of genome data obtained
so far from remains of enslaved people and 
offer insights into the transatlantic slave 
trade, in which an estimated 12 million Africans
were kidnapped and enslaved in North and 
South America and the Caribbean. 
Researchers analysed DNA taken from 
20 people from the British island territory 
of St Helena, whom the British Navy had 
liberated and brought there. The research, 
posted on the bioRxiv preprint server last 


month, suggests that the people might have 
been captured in parts of west-central Africa, 
including present-day Angola and Gabon 
(M. Sandoval-Velasco et al. Preprint at bioRxiv
http://doi.org/ddq2; 2019). 


No island paradise


St Helena, which lies in the Atlantic Ocean 
nearly 2,000 kilometres west of Angola, 
occupies a unique chapter in the history 
of the transatlantic trade in people. After 
Britain outlawed the slave trade in 1807, its 
navy intercepted slave ships and sent an esti- 
mated 24,000 people to St Helena. They had 
been aboard ships heading largely to Brazil 
and Cuba between 1840 and the late 1860s. 

Many of the people freed arrived in poor 
health and were housed in squalid condi- 
tions, and as many as 10,000 died. In 2006, 
construction work uncovered mass burials, 
and archaeologists unearthed the remains of 
325 people — more than half under 18. 

Unlike cemeteries in the Americas, which 
tend to hold multiple generations of people 
who had once been enslaved, nearly all of the 


people who died on St Helena were likely to 
have been born in Africa. 

Shipping records — the main historical 
source on the African origins of people taken 
into captivity — tend to record only the ports 
from which slave ships set sail, but other 
records suggest that many of the people were 
captured farther inland. 

In an attempt to better trace the Africans left
on St Helena, a team led by palaeogenomicist 
Marcela Sandoval-Velasco and ancient-DNA 
researcher Hannes Schroeder, both at the 
University of Copenhagen, tested remains 
from 63 of the people for intact DNA. They 
sequenced partial genomes from 20. 

Seventeen were male — backing up records 
indicating that, in its final decades, the trans- 
atlantic slave trade captured more men than 
women. Analysis of the genome data found that 
none of the people were closely related, nor 
did they belong to a single African population.

Comparisons with genome data from 
thousands of modern Africans from dozens 
of populations suggest that the people from 
St Helena are most closely related to people 
living today in central Gabon and northern 
Angola. But the researchers caution that gaps 
in present-day genome data from potential 
homelands, such as the Democratic Republic
of the Congo, make it difficult to say for cer- 
tain where the people buried in St Helena were 
taken from. “Although it’s very hard to exactly 
pinpoint their origins, I think what we see in 
our results is that they are not coming from 
a single population,” says Sandoval-Velasco.

This insight suggests that the liberated 
Africans on St Helena lived in a challenging 
multicultural setting where they might not 
have understood the language and customs 
of others left on the island. “We hope that by 
illustrating the history and the condition of a
few, we are at the same time illustrating the 
condition of the many, but it shouldn’t stop 
there,” Sandoval-Velasco says. 

Genome analysis shines a powerful light 
on people exploited in one of history’s dark- 
est chapters, says Rosa Fregel, a population 
geneticist at the University of La Laguna in the
Canary Islands. “Usually it’s just about num- 
bers — how many people from each country. 
Here, we are talking about particular people 
and their origin,” says Fregel. “Ancient DNA 
has the potential to tell their story.”

Communities will receive 1.5% of the farm gate price, equivalent to US$799,000 in 2019.


INDIGENOUS COMMUNITIES 
WIN HISTORIC RIGHT TO 
ROOIBOS TEA PROFITS 


Industry agrees to pay but contests research that San 
and Khoi used rooibos before European settlers. 


By Linda Nordling 


More than a century after commercial
farming began on their traditional
lands, the San and Khoi peoples of
southern Africa will share in the
profits of the lucrative rooibos
tea industry, the South African government 
announced on 1 November. 

The announcement is the culmination of a 
decade-long negotiation between industry 
representatives and San and Khoi community 
groups. A review of the historical and ethno- 
botanical research literature published by the 
South African government in 2015 concluded 
that there is a “strong probability” that the first 
users of rooibos were the San people and that 
they, and the Khoi, should be compensated
by industry (see go.nature.com/2jkviye). 

Industry representatives have agreed 
to the payments, but say that they do not 
accept the government’s interpretation of the 
research. They conducted their own literature 
review, which comes to a different conclusion.

By contrast, representatives of the commu- 
nities and the government have welcomed the 
agreement — the largest of its kind between 


Indigenous peoples and industry. 

“Rooibos is part and parcel of my upbring- 
ing,” says Collin Louw, chairman of the San 
Council of South Africa. He says his ancestors 
used it to soothe skin rashes, among other 
things. 

The agreement is also significant because 
it’s the first such arrangement since the 2010 


ratification of the Nagoya Protocol of the 
United Nations Convention on Biological 
Diversity. This is an international law that 
sets the rules for compensating communities 
if their knowledge of biodiversity is used by 
businesses or scientists. 

Tim Hodges, a Canadian diplomat who 
co-chaired the Nagoya process, calls the 
rooibos agreement a historic achievement and 
a model for other countries and industries. 

In particular, the announcement will be
closely read by researchers and funders
involved in the Intergovernmental Science- 
Policy Platform on Biodiversity and Ecosys- 
tem Services (IPBES), an effort to provide 
scientific advice on the world’s epic loss of 
biodiversity. 

According to the IPBES, traditional knowl- 
edge — defined by the World Intellectual 
Property Organization as knowledge that is 
handed down from generation to generation 
— of biodiversity is key to discovering as-yet 
undescribed species. South Africa’s decision 
suggests that brokering agreements to access 
this knowledge will take time. 

“These are sensitive issues,” says IPBES 
member Unai Pascual, an environmental econ- 
omist at the Basque Centre for Climate Change 
in Bilbao, Spain. “The concerns of centuries 
cannot just be resolved in a few years.”


Counting the cost 


Under the rooibos agreement, the San and 
Khoi communities will receive 1.5% of the ‘farm 
gate price’ — the price that agribusinesses pay 
for unprocessed rooibos (Aspalathus linearis), 
which is endemic to the Cederberg region 
north of Cape Town. 

For 2019, the government considers that 
the compensation will amount to 12 million 
rand (US$799,000). The communities will
split the proceeds fifty-fifty. A third group — 
small-scale non-white rooibos farmers in the 
region who were disadvantaged under apart- 
heid — will share in the Khoi portion. 

The San communities are among the world’s
oldest, and are understood to have been in 
southern Africa for some 100,000 years. The
Khoi arrived more recently, about 2,000 years 
ago. European settlers attacked these com- 
munities and occupied their lands starting in 
the mid-1600s, and today both San and Khoi 
peoples are scattered throughout southern 
Africa. 

Commercial rooibos farming, which is 
now worth an estimated 500 million rand a 
year, began on these lands in the early 1900s, 
industry representatives say. In 2010, the 
San Council of South Africa approached the 
government with a claim under South Africa’s
biodiversity law, asking for compensation 
for its peoples’ traditional knowledge of the 
plant and for the use of San imagery in rooibos 
packaging and marketing. 


A question of origins 
In spite of the agreement, the precise origins 
of rooibos tea remain contentious. Repre- 
sentatives of the San and Khoi say that their 
ancestors shared knowledge of the plant with 
colonial settlers. The literature also points to 
its early uses as a health tea and as a diuretic. 
Overall, there is a lack of research litera- 
ture in this field, but what there is suggests 
that rooibos as a beverage did originate with 
the groups’ ancestors, according to the 2015
study commissioned by the South African
government. 

However, a separate 2017 report commis- 
sioned by the industry-led South African 
Rooibos Council (SARC) says that there is 
no conclusive recorded evidence that the 
original inhabitants of the Cederberg region 
used rooibos to brew tea, or that they taught 
colonial-era settlers about it (see
go.nature.com/2ndv2zq).


Alternative interpretations 


This disagreement caused deadlock as each 
side stuck to its interpretation of the research. 
SARC chairman Martin Bergh, who is also 
managing director of one of the largest roo- 
ibos agribusinesses, Rooibos Ltd, based in 
Clanwilliam, South Africa, says the industry 
still does not accept that Indigenous commu- 
nities used rooibos as tea. But he does agree 
that the San probably knew about the rooibos 
plant before anybody else. 

However, the government accepted that 
the communities deserve to be compensated 
because rooibos is endemic to where it is now 
grown (see ‘The rooibos belt’), and the San 
and Khoi lived there for centuries before the 
settlers. The 2015 study also found no evi- 
dence casting doubt on the communities’ 
argument that their ancestors used rooibos 
as a beverage. 


THE ROOIBOS BELT 


Rooibos is endemic to Cederberg, 
one of South Africa's most biodiverse 
regions, which is north of Cape Town.

All sides have pledged to revisit the agreement
in a year’s time, because other questions
also remain. For example, Rachel Wynberg, 
who researches the commercialization of bio- 
diversity at the University of Cape Town, ques- 
tions how the funds will reach San and Khoi 
individuals, many of whom are not well-con- 
nected with Indigenous leadership structures 
such as the San Council. She also says that 


establishing which community contributed 
how much to the plant’s uses is “fraught with 
difficulties”, and is likely to lead to arguments
— especially where communities have been 
oppressed and marginalized. 

And Barend Salomo, who manages a coop- 
erative of small-scale farmers in Wupperthal 
who will also benefit, says that the money for 
his community will not stretch far. He hopes 
that the agreement can be tweaked to provide 
greater dividends in future years. “We don’t 
want to kill the industry, but this is not fair,” 
he says. 

But Willie Nel, a large-scale rooibos farmer
based outside Clanwilliam, says that if farm- 
ers are unable to recoup the cost of the levy by 
charging higher prices, they will lay off farm 
labourers, who, he says, are mostly members 
of the third group that the agreement is sup- 
posed to help. Nel also worries that mount- 
ing demands for restitution might lead to the 
industry pricing itself out of the market. “We 
think rooibos is special but there are so many 
teas from all over the world. And we have to 
compete with all of them.” 

But as more than 80% of rooibos tea is sold 
in Europe, Japan and North America, envi- 
ronmental economist Pascual doesn’t think 
these sales are likely to be affected by a small 
increase in price. “That’s not how economics 
works,” he says. 


R. WYNBERG S. AFR. J. BOT. 110, 39-51 (2017).

The median number of
authors listed on papers

the past two decades,
especially in biomedicine,

collaborative research.

GENDER
The percentage of
scientific papers with 
female authors seems 
to have grown over 
time. However, the 
trend is not clear 
because the algorithm 
used to assign genders 
was only able to classify 
fewer than one-third of 
authors’ names in 
earlier years — many 
items were signed using 
initials — and only 
three-quarters of names 
in the current era.

The evolution of scientific illustration


The interplay of image-making, research and visual 
technologies over the past 150 years. By Geoffrey Belknap 


Science is a fundamentally visual
endeavour. It pivots on the material,
whether that is an atom, a gene, a
crystal, a whale or a distant galaxy. Its
aim is elucidation. Thus, communicating
research has always been predicated on
combining image and text to share discoveries, 
ideas and observations. 

When it came on the market in November 
1869, Nature stated its commitment to the 
visual with a beautifully drawn masthead 
showing Earth emerging from clouds. (The 
artist might have been engraver James Davis 
Cooper, who illustrated Charles Darwin’s 
1872 book The Expression of the Emotions in Man
and Animals.) Under the masthead were the
words ‘A Weekly Illustrated Journal of Science’. 
The banner, if not the subtitle, remained on 
Nature’s front page until just after the Second 
World War. 

Over the years, Nature adapted through its 
succession of editors, with, in recent decades,
‘sister’ journals carving out their own space 
in increasingly specialized scientific disci- 
plines. Images remained central throughout. 
For instance, in 1896, Nature published physicist
Wilhelm Röntgen’s first X-ray plates; in
the 1920s, maps to debate Alfred Wegener’s
theory of continental drift; and in 1968, the
graphs that described astrophysicist Jocelyn
Bell Burnell’s discovery of pulsars.

In some ways, the role of images in science 
publishing hasn’t changed much over the past 
150 years. Much scientific evidence takes the 
form of visualizations: illustrations, graphics 
and latterly photographs. What has shifted, 
inevitably, are the tools. Initially, Nature and 
other science journals featured monotone 
printed engravings. Now, its visual landscape 
is digitized, often mobile, presented in vivid 
colour and vastly expanded to reflect changes 
in technological capacity and science itself, 


as the cover of this anniversary issue attests. 

The late nineteenth century saw intense 
flux in scientific disciplines; boundaries 
between them were more porous. Images of 
discoveries sat cheek by jowl on journal pages
with diatoms and archaeological artefacts 
(such as 800-450 BC stone tools found in 
Scotland by archaeologist Robert Munro).
Whereas many images are now used to inter- 
pret or visualize data, these early examples 
were mainly representations of scientific data:
a photograph of an eclipse, say, or a drawing
of a geological formation.

A drawing of ‘Titanophasma Fayoli Brongniart’, published in Science in 1883.

The illustration on Nature’s inaugural cover, in 1869.

In Nature’s inaugural issue, such data-led, 
representational images had an important 
role. The journal’s first editor, astronomer 
Norman Lockyer, had co-discovered helium 
in 1868 using electromagnetic solar spec- 
troscopy. He illustrated his article ‘The 
Recent Total Eclipse of the Sun’ with two 
photographic images: a solar spectrum and 
an engraving derived from a photograph of 
a solar eclipse.

These are not photographs as we would 
understand them. Before the 1890s, most 
images reproduced for journals were wood 
engravings, which were inked and set for 
printing alongside the typeset. To make them, 
the engraver would either copy by eye or lay
a photograph directly onto the woodblock
while carving.

Drawing in the Magazine of Natural History.


The art of precision 


Precision was centrally important. If a line on
Lockyer’s spectrum was in the wrong place, 
it might suggest that the Sun is composed of 
calcium rather than hydrogen. So, to ensure 
accuracy, skill and collaboration were neces- 
sary. The era’s illustrators and engravers were 
often scientists themselves, or worked closely 
with researchers. Illustrators might even copy 
the image directly onto the block or plate 
ready for the engraver to do their work. 

In their book Objectivity (2007), science 
historians Lorraine Daston and Peter Gali- 
son describe such collaborative processes of 
image-making as “four-eyed sight”. Authors 
and image-makers worked together to shape 
and construct an observationally reliable 
image. Similar collaborations were common
throughout the century. For the first
issue of his Magazine of Natural History, for 
instance, botanist John Claudius Loudon had 
an engraver copy the prints from John James 
Audubon’s Birds of America (1827-1838). These 
worked as field guides for readers, even as the
magazine became a forum for debating new 
findings with an expert community. 

By the time Nature appeared, the model of 
a journal targeting a professional scientific 
community was emerging. Researchers might 
be ‘amateur’ naturalists who collected and 
described species, such as the botanist Alfred
William Bennett and cryptogamist Miles Joseph
Berkeley, who sent images to Nature depict- 
ing the cause of ‘rust’ on wheat and barberry 
plants. Or they might belong to the nascent
class of university-based laboratory scientists
such as physicist Peter Guthrie Tait, who sent in
a sketch visualizing his apparatus for measuring
the wavelength of monochromatic light.
Specialist journals such as the Journal of Physiology,
launched in 1878, reflected emergent
disciplines while also making space for both 
amateurs and professionals. 

Compelling imagery was becoming a competitive
factor in this burgeoning marketplace


of ideas. The non-scientific Illustrated London
News, launched in 1842, had established a 
precedent for capturing large readerships 
through abundant visuals. As I described in 
my book From a Photograph (2016), Nature 
responded to this pressure to some degree.
Yet as historian Melinda Baldwin points out 
in Making Nature (2015), it wasn’t until 1890 
that the journal first made a profit. The cost
of images had a big effect on the bottom line. 
The geologist Edward Charlesworth, who took 
over ownership of the Magazine of Natural His- 
tory from Loudon, had to slim issues down to 
squeeze in more images on each page.
Nature, with its all-encompassing one-word 
title, also catalysed direct competition, such as
astronomer Richard Anthony Proctor’s Knowl- 
edge (subtitled ‘An Illustrated Magazine of Sci- 
ence’) and the US illustrated weekly Science. 
In 1883, one of the latter’s early editors, the 
entomologist and palaeontologist Samuel Hubbard
Scudder, published a description of a giant
fossil stick insect discovered in coal deposits
in France by another entomologist, Charles 
Brongniart. Its accompanying engraving (see 
page 25) effectively stitches together two pieces 
of observational data — the body and wings of 
the insect, separated in the coal bed. 

Such simple line engravings had become a 
staple. The same year in Nature, Canadian bot- 
anist Grant Allen published a series depicting 
the shapes of leaves, arguing that their shapes 
reflect levels of competition with other plants 
for access to energy sources. 

Between the 1880s and 1900, the old 
collaborations gave way to technological inter- 
locutors: photographers. Science journals 
viewed photography as a way of seeing that 
enabled “mechanical objectivity”. There was 
greater trust in the power of ground lenses and 
silver halides to capture the world in a way that
the eye cannot. As with all visual technologies, 
however, it needed selection, organization and 
interpretation for data to be rendered into a 
comprehensible image. 

The work of French physiologist Étienne-
Jules Marey is an iconic example. Following
Eadweard Muybridge, who had captured ani- 
mal locomotion through ‘instantaneous pho- 
tography’ in the 1870s, Marey developed his 
own method: chronophotography. In an 1882 
issue of Nature, he described his ‘photographic 
gun’, which used a rotating photographic plate 
to take sequential images of a flying bird,
helping to pave the way to understanding 
powered flight. Meanwhile, the Carte du Ciel 
project at the Paris Observatory, which ran 
from 1887 to 1950, led to the creation of 22,000
glass-plate negatives of stars from more than 
20 observatories.

Throughout the first half of the twentieth 
century, photography became crucial to sci- 
ence. Kathleen Lonsdale pioneered the form of 
crystallography in which X-rays are directed at 
a sample to measure diffraction and determine
its atomic and molecular structure. Lonsdale 
published her 1928 findings on the benzene 
ring in the Proceedings of the Royal Society.
Her X-ray diffraction photographs, such as
the 1941 series of eight on diamonds, regularly
appeared in Nature’s pages.

The technique became crucial to the explosive
discovery of DNA’s structure by molecular
biologists James Watson and Francis Crick,
published in Nature in 1953. The key piece of 
evidence, ‘Photograph 51’, showed the dif- 
fraction pattern of DNA and was taken, under 
the supervision of crystallographer Rosalind 
Franklin, by then-graduate student Raymond 
Gosling.

Photography was also used to disprove 
one of the biggest twentieth-century scien- 
tific hoaxes. In 1912, amateur archaeologist 
Charles Dawson claimed to have discovered 
the missing link between humans and apes in 
what looked like an early human skull found in 
Piltdown, Sussex. In 1913, the anatomist David

Volume Control

David Owen Riverhead (2019) 

“For a deaf child, having hearing parents can be a serious handicap,” 
notes New Yorker staff writer David Owen in this sensitive study of 
hearing. (He is personally involved, as someone with tinnitus who 
saw his grandmother struggle with deafness.) Meshing the science 
with individual auditory experiences, Owen discusses hearing
aids, cochlear implants, genetically deafened mice, sign language,
Thomas Edison and noise levels in US cities and towns, all in
absorbing, anecdotal detail, although regrettably with no diagrams.

Reality Ahead of Schedule

Joel Levy Smithsonian (2019) 

This picture-packed volume by science journalist Joel Levy tours 
scientific advances sparked by ideas in science fiction. The title 
comes from a definition of sci-fi by Syd Mead, an industrial designer 
behind the look of futuristic movies such as Blade Runner (1982). 
But how prescient is sci-fi? Levy shows how H. G. Wells’s 1903 story
‘The Land Ironclads’ inspired Winston Churchill to promote the 
development of the military tank in 1915. But Wells did not envisage 
its key technical idea: caterpillar tracks, for added grip. 


Jet Stream 

Tim Woollings Oxford University Press (2019) 

The jet stream — strong high-altitude air currents — was discovered in 
the 1920s. In this analysis of its complex impact on weather, physicist 
Tim Woollings relates how in 1944, the Japanese used the jet stream 
to launch trans-Pacific incendiary balloons. By strange chance, one 
hit the US plant that provided plutonium for the bomb that devastated 
Nagasaki in 1945. Today, argues Woollings, the jet stream is “very 
likely” to be threatened by another product of human activity: rising 
carbon dioxide emissions. 
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Adventures of a Computational Explorer 

Stephen Wolfram Wolfram Media (2019) 

Computer scientist and businessman Stephen Wolfram, designer of 
the technical-computing system Mathematica, proffers good stories 
in this collection of autobiographical essays. In ‘Something | learned 
in kindergarten’, he recalls himself as a six-year-old spotting a bite 
taken out of the Sun: a solar eclipse, something unknown to the other 
children. In ‘My life in technology’, he recalls rejecting the Latin word 
mathematica, learnt at school, as too long and ponderous. Silicon 
Valley luminary Steve Jobs convinced him otherwise. 
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Lightspeed 

John C. H. Spence Oxford University Press (2019) 

Starting with Albert Einstein, scientific consensus holds that the 
speed of light is a universal constant. So writes physicist John Spence 
in his history of attempts to measure the speed of light. Spence 
considers the implications of its constancy for modern physics 

and technology. For instance, the aether — a theoretical space- 

filling medium rejected in Einstein’s relativity — is still “anything but 
empty”. Despite its appealing vignettes of great physicists, this is a 
challenging read. Andrew Robinson 
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Nature cover illustration from August 2019. 
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Waterson published a Nature article including 
three drawings taken from X-rays, revealing 
that the mandible of the Piltdown ‘skull’ was 
almost identical to a chimpanzee’s”. By the 
1950s, ‘Piltdown Man’ had been debunked. 
Nature had become a site for exposing bad 
science through visual evidence. 

By the mid-twentieth century, the long-term 
boom in science journals (there were already 
10,000 by 1900) was unabated, keeping pace 
with the growth in academic science and the 
proliferation of fields. Photographic technol- 
ogies remained central, but visual content was 
diverse and included graphs and early digital 
images. And starting in the 1970s, technological
imaging innovations allowed science to
see further and deeper. The cryo-electron
microscope, first announced in Nature,
allowed electron microscopy to be applied to
organisms by freezing and suspending them
in an aqueous solution. (In 2017, its inventors
— Jacques Dubochet, Joachim Frank and Richard
Henderson — won the chemistry Nobel prize
for their work on the structure of viruses.)
The invention of the charge-coupled device
in 1969 meant that images could be captured
on a silicon chip: photography had entered the
digital realm. Digital-imaging sensors in tele- 
scopes have had a vast impact on astronomy. 
In 2018, Michael Koss and colleagues verified
the theory that black holes merge through
the use of visual data from the Sloan Digital
Sky Survey. And in 2019, the first image of
a black hole was released, created using the 
Event Horizon Telescope. Nature was key to 
communicating these new technologies and 
providing a platform for debating them. 

Now, digital imaging is reaching further, 
with techniques such as hybrid multiplexed 
sculpted light microscopy under development 
to measure neuroactivity, and NASA’s grand
database of multi-wavelength images of the 
galaxy (see go.nature.com/2bdrua5). There 
are extraordinary shots of nebulae snapped 
by the Hubble Space Telescope, ‘extreme 
zoom’ images of single atoms, and the bur- 
geoning field of data visualization — graphical 
representations of data. 

Macro to micro, imaging today is exquisitely
precise and often beautiful, able to capture 
worlds and structures far beyond the scope 
of human vision. The ‘Drawn Together’ cover 
image for the 8 August 2019 issue of Nature is a
case in point. Crafted by illustrator Inna-Marie
Strazhnik, it provides a visualization of work 
by bioengineer Tyler Ross and colleagues, who 
used light-activated motor proteins to move 
microtubules into networked structures”. 
Strazhnik translated the models, images and 
graphs in the paper into a dynamic, almost 
three-dimensional image. 

The visual continues to work as a founda- 
tion for making sense of data. The tools, as we 
have seen, have radically changed. The power 
of images has not. 


Geoffrey Belknap is a historian of science, 
photography and visual culture. He is head 
curator of the National Science and Media 
Museum in Bradford, UK, which is part of the
Science Museum Group. 
Science must move
with the times 


Philip Ball 


Research cannot fulfil its 
social contract and reach new 
horizons by advancing on the 
same footing into the future, 
argues Philip Ball in the last 
essay of a series on how the
past 150 years have shaped 
today’s science system. 


In 1866, three years before the first issue
of Nature was published, a transatlantic 
telegraph cable established light-speed 
communication between Great Britain and 
North America. The triumph won William 
Thomson (later Lord Kelvin) a knighthood for 
the scientific advice he had given to the project. 
Yet Thomson had also advised on a disastrous 
earlier attempt in 1858 that barely worked from 
the outset and deteriorated within weeks. 
It was partly in response to that costly debacle
that the Cavendish Laboratory was established
at the University of Cambridge, UK, in the early
1870s, to provide the nation’s future engineers
with a better grounding in physics. The
first director was James Clerk Maxwell, whose
electromagnetic theory of the mid-1860s led
to the discovery of radio waves in 1887 — which
soon enabled ‘wireless’ telecommunication and
rendered the telegraph obsolete.

In such ways, the distinctly Western and 
specifically British world into which Nature 
was launched regarded fundamental scientific 
research as the engine of socially transformative
industrial innovation. Emanating from London,
Norman Lockyer’s journal showcased those
developments from the perspective of a British
Empire that grew to encompass about one-fifth
of the world’s population by the century’s end.
The benefits of research laboratories and the systematic
institutionalization of science, in both
academia and industry, were beyond doubt for 
Nature's target audience. 

Eight decades later, this model motivated 
Vannevar Bush’s 1945 report to US president 
Franklin D. Roosevelt. Science — The Endless 
Frontier made the case for governmental 
support of basic science research to promote 
national security, public health and welfare. 
It led to the establishment of the US National 
Science Foundation, and it appealed to the 
optimistic and simplistic vision of science as 
a quest that, motivated by curiosity and guar- 
anteed freedom of enquiry, would serve the 
interests of the nation and of humankind. 

Science — whether it is Maxwell’s electro- 
magnetism, the Manhattan Project that inspired
Bush, or the Human Genome Project — has 
indeed been so socially transformative that its 
intellectual and technological machinery has 
gained seemingly irresistible momentum. Is 
this not how progress is made, and is that not, 
on balance, a good thing? 

Even to ask the question invites familiar and
polarized arguments. Some commentators 
question the wisdom of unfettered scientific 
development, pointing to the problems of cli- 
mate change and environmental despoliation, 
nuclear weapons and antibiotic resistance, 
along with the ambivalent influence of artifi- 
cial intelligence and robotics, information tech- 
nologies and genetic engineering. Others point 
out that quality-of-life indicators — lifespan and 
infant mortality, say — have improved steadily 
(if unevenly, geographically and temporally) 
during the era of modern science that roughly
coincides with the span of Nature’s existence. 

But Manichean views and tropes of ‘dual use’ 
miss the point. Some of the key questions that 
confront science today are about whether its 
methods, practices and ethos, pursued with very 
little real change since Maxwell's day, are fit for 
purpose in the light of the challenges — conceptual
and practical — we now face. Can science
continue to fulfil its social contract and to reach
new horizons by advancing on the same footing
into the future? Or does something need to shift?


Looking out 

Let’s consider where we stand. The convention 
of the past century or so has tended to place 
the frontiers of knowledge at the scales of the 
very large and very small. Today we might 
be inclined to add the very complex — which 
typically pertains to the intermediate scales of 
direct human experience. 

It’s now clear that challenges at the two 
extreme scales — fundamental particles and 
cosmology — are related. As the island of 
knowledge grows, so does the perimeter of the 
horizon where knowledge ends, says Marcelo 
Gleiser, a particle cosmologist at Dartmouth 
College in Hanover, New Hampshire. “The more 
we know, the more exposed we are to our ignorance,
and the more we know to ask”, he writes.

We have known for only several decades that 
dark matter outweighs all visible matter by a
factor of five, yet we are no closer to knowing 
what it consists of. And scarcely two decades 
have passed since the mysterious entity dubbed 
dark energy, which causes the Universe’s 
expansion to accelerate, has been recognized 
to comprise more than two-thirds of the total 
cosmic energy density. Never before has our 
knowledge of the Universe seemed so deficient. 

Plugging these gaps at the largest scales will 
depend on elucidating the physical world at
the smallest. Here the prospects are currently 
dim enough to cause desperation and even ran- 
cour. The world’s largest particle accelerator, 
the Large Hadron Collider at CERN near Geneva, 
Switzerland, has so far failed to offer any hint 
of how to proceed beyond known physics. Elegant
ideas look moribund in the face of an ugly
lack of facts. In the meantime, models are being
forced towards ideas, such as the multitude of
universes now permitted by the inflationary
model of the Big Bang, that seem to some critics
to abandon the empirical basis of science itself. 

Yet even as our view of the Universe becomes 
increasingly perplexing, it is being fleshed out
as never before. In the 1860s, it was almost 
casually assumed that life would be common 
on other worlds. H. G. Wells’s 1897 novel The 
War of the Worlds (informed by his reading of 
Nature) seemed all the more chilling because 
of the widespread belief — which persisted for 
another half-century — that there was indeed 
life on Mars. Seasonal changes of surface colour 
were interpreted as vegetation growth, and 
striations described by astronomer Giovanni 
Schiaparelli were notoriously ascribed by others
to artificial waterways. 

But the barren, sterile Martian landscape that
the Viking landers revealed in 1976 confirmed
a growing sense — stoked by the Apollo Moon
landings and reflected in physicist Enrico Fermi’s
famous question about the apparent absence of
alien visitations — that we are a lonely outpost
in a bleak, lifeless cosmos. Well, no longer. Since
the first discovery of an extrasolar planet orbiting
a Sun-like star was reported in this journal
in 1995, around 4,000 sightings of such planets
have now accumulated (and a 2019 Nobel prize).

It seems that planetary systems are the norm
for other stars, and Earth-like planets far from
uncommon. Already we know a little about the
atmospheres of some of these worlds. With the 
launch of NASA’s Transiting Exoplanet Survey 
Satellite last year, and the James Webb Space 
Telescope scheduled to launch in 2021, we will 
soon know much more. Researchers are now 
speaking plausibly about deducing within a 
lifetime if there is life elsewhere. 

Where does all this leave us? The cosmolog- 
ical perspective could seem to perpetuate the 
sense of an unfolding Copernican revolution 
in making humankind even more peripheral. 
Not just an insignificant dot in a vast Universe,
we're possibly an insignificant universe in a
potentially infinite multiverse. It’s hard to 
imagine a demotion more extreme. 

© 2019 Springer Nature Limited. All rights reserved.

There is another view that is anything
but Copernican. Here, habitable worlds are
ubiquitous and we remain uncomfortably,
almost absurdly, at the centre of things. In the
inflationary multiverse, our presence is the 
explanation for the fundamental constants 
of nature. They might have different values in 
other universes, but the conditions necessary 
for our existence guarantee that we will see the 
ones we do. 

The foundations of quantum mechanics 
(a topic once disreputable that now verges 
on fashionable) muddy the picture too. The 
‘many worlds’ interpretation is more popular 
today than when US physicist Hugh Everett 
proposed it in the 1950s. It multiplies universes
(in a manner distinct from the cosmological 
multiverse) and it multiplies each of ‘us’ beyond 
measure. Meanwhile, US theoretical physicist 
John Wheeler’s ‘participatory universe’ and new 
interpretations such as QBism insist that quantum
theory requires the observer’s presence —
rather than being the abstract and objective 
framework that science usually supplies. 

These ideas remain speculative. But they 
challenge the Newtonian promise of an 
impersonal mechanics. 


Looking in 

In other words, it’s still unclear when or whether
we can exclude ourselves from the scientific 
frame. This would have been no surprise to 
Maxwell. His conception of physical reality 
was predicated (no less than was Newton’s) on 
a religious position that awarded humanity a 
special place. 

This, of course, is where Charles Darwin 
also enters the frame. His ideas on evolution 
by natural selection, published in On the Origin 
of Species (1859), were still causing shock waves
when Nature was founded. Two years after that, 
he delivered the final bombshell in The Descent 
of Man (1871). The significance of his ideas was 
not as an explosive charge placed underneath
the church but as the opening salvo to a century
and a half of debate about what it means to be 
human. If there was a struggle, it was not about 
which book to consult but about who had the 
most decisive authority. Within science, first 
evolutionary theory, then psychoanalysis, and 
now genetics and neuroscience, have all staked 
their claims. 

On Nature’s centenary, you might have 
placed your bets with the latter disciplines. Half 
a century later, it is less clear that they can offer
the last word. Powerful new techniques applied
to rapidly growing data sets, such as genome-wide
association studies, have disclosed a
clear and sometimes strong genetic component
to almost every human behavioural trait
we choose to study, as well as influencing health
and disease. But a mechanistic understanding 
of genetic effects often remains remote. And 
for traits in which many — perhaps even several 
thousand — genes are implicated, it is not even
clear if this is the right level at which to ascribe
causes for what we can see and measure.

The first six primary mirror segments for the James Webb Space Telescope.

The emerging picture of development and 
tissue function at the level of single-cell tran- 
scription (and perhaps soon of translation) 
adds a new layer of complexity’. Apparently 
identical cells in the same tissue can show a 
wide range of dynamic states of gene expres- 
sion. It might be that the genome tells us no 
more about how an organism builds and sus- 
tains itself than a dictionary does about how 
a story unfolds. New methods, rather than 
finally answering old questions, could merely 
beggar them, shifting the goalposts entirely — 
as genomics itself has done for notions of race. 

Neuroscience, like genetics, has been 
restricted in the questions it can ask by the data 
it can gather. Functional magnetic resonance 
imaging remains a blunt tool, showing where 
things are happening in the brain (at rather 
coarse-grained resolution), but not what tran- 
spires. The idea that the human brain might be 
understood by exhaustive documentation and 
perhaps simulation of neuronal connections 
and firing patterns was challenged as soon as it
was mooted (by the ill-fated European Human
Brain Project).

Here we arrive at one stretch of the 
‘complexity’ frontier. If history is any guide, 
we should expect that understanding these 
complex systems will not emerge by drawing 
analogies with the latest cutting-edge technol- 
ogies. Just as the brain is not (as was thought in 
the early nineteenth century) a battery, neither
is it a computer; nor is the genome a digital list
of parts. And more data, although extremely 
valuable as a resource, will not help us without 
new ideas. These are in short supply. As neuro- 
biologist and historian Matthew Cobb at the 
University of Manchester, UK, writes, “no major 
conceptual innovation has been made in our
overall understanding of how the brain works
for over half a century”.

It’s no surprise, then, that the ‘hard problem’
of consciousness is barely articulated, let alone
understood. We are still at the stage where serious
thinkers on the topic embrace the gamut
of positions, from regarding it as an illusion
to considering it the only valid starting point
for a theory of human experience. That latter
view harks back to how US psychologist William
James ignored “the traditional antithesis
between reality and appearance”, as Nature put
it in 1915 (ref. 8). As for claims that neuroscience
has banished free will (for example, because 
decisions can be predicted from brain scans 
in advance of their conscious manifestation), 
saying that “your brain decides before you do” 
merely returns us to British philosopher Gilbert 
Ryle’s famous regression of mental homunculi’. 


New views 


Among the ways in which science has changed 
over the past century and a half, three loom 
large. First, it is no longer driven by lone figures
labouring in their laboratories, but has become
a team effort that spans labs, departments, disciplines,
institutions and continents. Second, it
often relies now on datasets so vast that human
brains cannot hope to hold or parse them all.
Third, it increasingly confronts issues of global
reach and even existential urgency — from climate
heating and the need for a carbon-neutral
economy, to epidemics and water security.
Yet these changing demands are not
reflected in incentives, funding mechanisms,
awards or popular narratives. Systemic 
biases — for example, in barriers to the entry 
and advancement of women and people from 
minorities, or in the demographic coverage 
of medical databases, or the prejudices that 
algorithms inherit from their makers — remain
entrenched. Even science’s internationalism 
is threatened by current political trends. To 
regard what biologist Thomas Henry Huxley 
in Nature’s first issue called the “progress of 
Science” as an inexorable, triumphant forward 
march, today seems dangerously complacent. 

It is time to ask whether such problems are 
not imperfections of the system but conse- 
quences of it. Science might be hindered by 
channelling its practitioners into a single mode
of thinking. There is hubris in the assumption
that the traditions, conventions, training,
disciplinary boundaries, methods, responsibilities
and social contract that crystallized in
the nineteenth century from a highly restricted
demographic must still be the best way of
working. To say as much is not to submit to
some trendy caricature of postmodernism.
Rather, it is to acknowledge that there are
assumptions embedded, often invisibly, in the
way we develop models, deploy metaphors, 
apportion priorities, recognize and reward 
achievement, and recruit participants that 
must be questioned. 

The canonical scientific article, with its 
unified and passive voice, its closed and 
self-contained narrative, its seductively con- 
fident diagrams and standardized format, and 
its eventual metric quantification of impact, is 
not the only or the best vehicle for translating 
and disseminating today’s research: for posing 
and then answering questions. There’s scope for 
more variety in who does this, and how. Who 
would have guessed, for example, that what was 
needed to finally put climate science firmly on 
the public agenda was the candour and courage 
of a schoolgirl who is on the autistic spectrum?

The history of science tells us that some of 
the toughest questions will be addressed not 
by being answered but by being replaced with 
better questions. Among those haunting us 
today that might deserve this fate are: what is 
life? What is consciousness? What makes indi- 
viduals who they are? Why does our Universe 
seem fine-tuned for our existence? How did it 
all begin? It will take creative and diverse thinking
to improve on them — for the view over the
horizon might not be the one we anticipated. 
Nature’s reach: narrow
work has broad impact


Alexander J. Gates, Qing Ke, Onur Varol and Albert-László Barabási


A scientific paper today is
inspired by more disciplines 
than ever before, shows a new 
analysis marking the journal’s 
150th anniversary. 


How knowledge informs and alters
disciplines is itself an enlightening
and vibrant field. This type of meta-research
into new findings, insights,
conceptual frameworks and techniques
is important, among other things, for
policymakers who fund research in the hope 
of tackling society’s most pressing challenges, 
which inevitably span disciplines. 

Since its founding in 1869, Nature has offered
a venue for publishing major advances from 
many fields. To mark its anniversary, we track 
here how papers cite and are cited across 
disciplines, using data on tens of millions 
of scientific articles indexed in Clarivate 
Analytics’ Web of Science (WoS), a bibliometric 
database that encompasses many thousands of 
research journals starting from 1900. We pay 
particular attention to articles that appeared 
in Nature. In our view, this snapshot, for all its 
idiosyncrasies, reveals how scientific work is 
ever more becoming a mixture of disciplines. 

Several caveats are important. The volatility of 
our metrics in the early twentieth century can be
attributed, at least in part, to the fact that articles
then typically had many fewer references and 
citations. Until the mid-1920s, Nature articles 
typically listed no references; today, they can 
have up to 50. Another caveat is that the number 
of disciplines recognized by WoS grew from 57 
in 1900 to 251 in 1993, but this is only one factor
contributing to the disciplinary trends we found. 

Many scholars have developed methods 
and metrics to gauge how scientific publish- 
ing contributes to knowledge, and to assess 
influence. For more detailed explanations of 
our choices, along with essential qualifications, 
see Supplementary Information (SI). 

Across the scientific literature overall, our 
analysis hints that articles are drawing from 
and influencing more disciplines than they did 
100 years ago, although some disciplines have 
broader influence than others. As a journal,
Nature publishes mostly specialized, or deeply
disciplinary, papers; these tend to reference
a narrower range of disciplines than does the
average paper. Usually, however, Nature papers 
are cited by a broader range of disciplines than 
average. 


Colossal corpus 


We extracted references for papers contained 
in the WoS publication database from 1900 to
2017, capturing close to 700 million citation 
relationships. We pinned subsequent analysis 
to the approximately 19 million articles that 
had at least one reference and one citation 
and that were published before 2010 (to give 
time for citations to accumulate). The resulting 
corpus integrated the discipline information 
for 38 million articles. 

To identify disciplines, we relied on relatively
broad categorizations from WoS. These are
necessarily imperfect, but cumulatively reveal
patterns of scholarship. Most journals are disciplinary,
and so WoS assigns each article to one
or more disciplines on the basis of the journal
in which it is published. For instance, articles
in the Journal of Bacteriology are categorized
as microbiology.

We traced the conceptual journeys to each 
paper by identifying the inspiration for articles 
by their references: the works authors credited 
for their concepts, methods, techniques and 
insight. Similarly, we identified the impact of 
each publication by the citations it received 
in the corpus. Caution is required when using
citation-based measures to assess the importance
of individual papers or authors; still, the
accessibility and quantity of such data provide
one view — among many — of how scientific
knowledge accumulates.

We explored how the 88,637 Nature articles 
in our data set mediate the metabolism of ideas 
using the broadest WoS disciplinary categories. 
A Nature article with references mainly from 
biomedical research will typically collect the 
largest proportion of its citations from other 
biomedical-research papers (see ‘Knowledge 
flows’). About half of the papers that cite it will 
be spread across the other categories. By con- 
trast, a paper with references mainly from engi- 
neering and technology is much more likely 
to be cited by papers in other fields (72%) than 
by other papers in the same field (28%). Engi- 
neering and technology papers also make up 
a very small proportion of the papers Nature
opts to publish; those that are selected might 
be chosen for their broad appeal. At the other 
extreme, papers in Earth and space science are 
much more likely to be cited by papers in their 
own field (72%) than by other disciplines (28%). 
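
As a rough illustration of the tally behind these percentages, the sketch below counts what share of the papers citing one article fall in the article's own broad discipline. The function name and the toy discipline labels are hypothetical and are not taken from the analysis; the real study assigned disciplines from Web of Science categories.

```python
from collections import Counter


def citation_shares(paper_discipline, citing_disciplines):
    """Return (same_field_share, other_fields_share) for one cited paper.

    paper_discipline: label of the cited paper (hypothetical label scheme).
    citing_disciplines: one label per citing paper.
    """
    if not citing_disciplines:
        raise ValueError("paper has no citations")
    counts = Counter(citing_disciplines)
    same = counts[paper_discipline] / len(citing_disciplines)
    return same, 1.0 - same


# Toy data (assumed): 18 of 25 citing papers share the cited paper's discipline.
citing = ["earth_space"] * 18 + ["physics"] * 4 + ["chemistry"] * 3
same, other = citation_shares("earth_space", citing)
print(f"same field: {same:.0%}, other fields: {other:.0%}")
```

Averaging such shares over every Nature paper in a discipline would give field-level figures of the kind quoted above.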


CO-CITATION NETWORK 


Each Nature paper is a dot. Dots are linked if another paper cites both. Some articles (colourful clusters) 
are cited by many disciplines, others (monotone areas) are deeply embedded in their own disciplines. 
(See go.nature.com/n15Oint for an interactive version, including references to the six highlighted papers.) 


Discipline key: arts; biology; biomedical research; clinical medicine; Earth and space; engineering and technology; health; mathematics; physics; business and management; psychology.
Another way to reveal intrinsic communities
in and across disciplines is through co-citation
analysis. In this approach, each paper is represented
by a node, shown as a dot. Two papers
are linked if another paper cites both of them; 
the node size reflects the number of co-cita- 
tions. Our visualization algorithm treats each 
link as a spring and arranges the nodes to make 
links as short as possible. This produces clus- 
ters of Nature papers that vary in their level of 
interdisciplinary connections (see go.nature.com/n15Oint).
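
The linking rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input format (a mapping from each citing paper to the papers it references), the function name and the reading of "node size" as a paper's total co-citations are all assumed here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(references_by_paper):
    """Count, for each pair of papers, how many other papers cite both."""
    pair_counts = Counter()
    for refs in references_by_paper.values():
        # Every citing paper adds one co-citation to each pair it references.
        for a, b in combinations(sorted(set(refs)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy citation data (assumed): three citing papers and their reference lists.
citations = {
    "citing_1": ["A", "B", "C"],
    "citing_2": ["A", "B"],
    "citing_3": ["B", "C"],
}
links = cocitation_counts(citations)

# Assumed node-size rule: sum of co-citations over all of a paper's links.
node_size = Counter()
for (a, b), n in links.items():
    node_size[a] += n
    node_size[b] += n

print(dict(links))
print(dict(node_size))
```

The weighted links could then be handed to a force-directed layout, for example `spring_layout` in the NetworkX library, which treats links as springs in the way the text describes. Which software the authors used is not stated here.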

The overall network structure echoes 
scientific perceptions of how publications 
relate to each other. Articles tend to bunch 
together according to age and topic, because 
authors usually reference recent articles 
related to their paper’s subject. Over its recent
history, more than half of Nature’s papers have 
come from the life sciences. Consequently, 
clusters of biomedical-research papers 
appear throughout the network. Since 1930 
(when it became reliable to use references to 
assign papers to disciplines), the proportion of 
physics papers has shrunk and Earth and space 
science has grown. Certain papers — such as 
the discovery of the first exoplanet orbiting 
a Sun-like star — are deeply embedded in a
cluster of papers in the same field. By contrast,
the discovery of the ozone hole is in a region
where articles of many disciplines — chemistry,
social sciences, Earth sciences — are found (see
‘Co-citation network’). Our analysis shows that
this paper’s references are more diverse than
95% of Nature papers, and its citations are more
diverse than 99% of Nature papers. 

An analysis of the co-citation network from
any more-specialized journal would probably 
look different. Still, distinct episodes from 
the history of science are apparent in the 
3D view of Nature’s co-citation network (see 
go.nature.com/2patums). These include the 
study of radioactive elements in the 1930s, 
and how studies of superconducting materi- 
als flirted with diverse applications and then 
were intensely characterized deep within the 
physical sciences in the late 1980s and 1990s. 


Over time 


The numbers of papers in every discipline grew
exponentially over the past century. Exact
rates differ over time, although since about the 
1960s, 48% of papers were in the life sciences 
(with 42% from ‘hard’ sciences and 10% from 
behavioural science). 

Scholars define and measure influences 
across disciplines in various ways. Multidisciplinarity
usually refers to separate disciplines coming
together yet remaining distinct: we define
it for journals as the breadth of disciplines that
are either inspiring or being impacted by the
journal’s articles. Interdisciplinarity refers to
integration: we define it as the diversity in inspiration
in an article’s references, and the diversity
in how an article’s impact diffuses across
disciplines. Although it is difficult to assess
integration across an article’s citations, this
measure can capture how the knowledge communicated
by the article had diverse impact.
This analysis indicates the extent of interactions
across disciplines, but does not reveal the specific
details of how those disciplines interact.

KNOWLEDGE FLOWS

Nature articles are mainly cited by their own disciplines, particularly in some fields, such
as Earth and space science. (Each Nature paper was assigned to a discipline using its
references, as was every paper in the Web of Science database that cited a Nature paper.)

First, we explored the breadth of disciplines
reflected in the references and citations
across a journal, capturing the journal’s
multidisciplinarity (see ‘Inspiration and
impact’). We labelled each paper in a journal
with the primary discipline assigned to its
references (inspiration) or citations (impact),
and measured multidisciplinarity on a scale
of zero to one. Zero meant that all of an article’s
references or citations were in the same
discipline; one meant that they were balanced
evenly across all disciplines, using the normalized
entropy measure (see SI). We found that
this measure does not depend on the number
of articles each journal published (see SI). It
probably reflects other qualities of a journal,
such as the pool of articles submitted and the
editors’ selection criteria.
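
One common way to obtain a zero-to-one score of this kind is to divide the Shannon entropy of the discipline labels by the logarithm of the number of possible disciplines. The sketch below assumes that form; the exact definition used in the analysis is in the SI, and the function name and toy labels are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(labels, n_disciplines):
    """Shannon entropy of the labels divided by log(n_disciplines).

    labels: one primary-discipline label per paper in a journal.
    n_disciplines: total number of disciplines in the classification.
    Returns 0.0 when every label is identical and 1.0 when the labels are
    spread evenly over all n_disciplines.
    """
    if not labels:
        raise ValueError("no labels supplied")
    if n_disciplines < 2:
        raise ValueError("need at least two disciplines to normalize")
    total = len(labels)
    entropy = 0.0
    for count in Counter(labels).values():
        p = count / total
        entropy += p * math.log(1.0 / p)
    return entropy / math.log(n_disciplines)


# Toy journals (assumed labels), scored against a four-discipline scheme.
specialist = ["microbiology"] * 6
generalist = ["biology", "physics", "chemistry", "mathematics"]
print(normalized_entropy(specialist, n_disciplines=4))
print(normalized_entropy(generalist, n_disciplines=4))
```

Because the score depends only on label proportions, a journal that doubles its output while keeping the same mix of disciplines gets the same value, which is consistent with the independence from article counts noted above.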

For most journals, the breadth of impact and 
inspiration are highly correlated. This holds
true for specialist journals such as Cell and 
Physical Review Letters. A typical journal today 
publishes articles inspired by and impacting 
about six disciplines. 

The general-science journals Nature and 
Science both have a greater breadth of impact 
(citations) and inspiration (references) than 
99.7% of other journals. The multidisciplinarity 
of Nature peaked in the 1960s and has remained
relatively high since then, probably reflecting 
a combination of papers selected by Nature 
that are expected to have broad appeal, and 
the papers’ greater visibility to the scientific 
community. 

Second, we explored the interdisciplinarity 
of individual articles by measuring the diver- 
sity of disciplines in the references and cita- 
tions’ ’°. Many measures have been proposed 
to assess interdisciplinarity, and can have 
inconsistent results (see, for example, refs 
11,12). Scholars agree, however, that simply 
counting the number of disciplines that occur 
in references and citations is inadequate. For 
example, a paper that largely references biol- 
ogy and clinical science draws on less diversity 
than one inspired by biology and physics. We 
quantify this characteristic on a scale of zero
to one using the Rao-Stirling diversity index, 
which captures the number of disciplines 
represented, how evenly they are distributed 
and their degree of difference’. 
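A minimal sketch of the Rao-Stirling index may make the biology-and-physics example concrete. It assumes the common form of the index, the sum over ordered pairs of distinct disciplines of p_i × p_j × d_ij, where p is the share of references in each discipline and d is a pairwise distance between 0 and 1. The distance values below are invented for illustration and are not taken from the analysis.

```python
from itertools import permutations


def rao_stirling(shares, distance):
    """Sum of p_i * p_j * d_ij over ordered pairs of distinct disciplines."""
    total = sum(shares.values())
    p = {name: value / total for name, value in shares.items()}
    return sum(
        p[i] * p[j] * distance[frozenset((i, j))]
        for i, j in permutations(p, 2)
    )


# Hypothetical distances: biology is "closer" to clinical science than to physics.
DISTANCE = {
    frozenset(("biology", "clinical")): 0.2,
    frozenset(("biology", "physics")): 0.9,
    frozenset(("clinical", "physics")): 0.8,
}

near = rao_stirling({"biology": 5, "clinical": 5}, DISTANCE)
far = rao_stirling({"biology": 5, "physics": 5}, DISTANCE)
```

Both papers cite two disciplines in equal shares, so a simple count cannot tell them apart. The index scores the biology-and-physics paper higher only because the assumed distance between those fields is larger.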

Our analysis shows that the diversity of dis-
ciplines in articles’ references and citations is
increasing. Roughly speaking, a typical article
is inspired by and impacts three times more 
disciplines this decade than it did 50 years ago. 

Whereas a typical article published today 
references articles from the equivalent of 11 dis-
ciplines, a Nature publication references the
equivalent of only 9 (SI, Fig. S5). This is in line 
with previous analyses suggesting that highly 
influential work tends to be grounded in deep 
expertise”. By contrast, the disciplinary diver-
sity for the citations of articles in general-sci-
ence journals has consistently been higher than
for articles published elsewhere, suggesting
that content in these journals reaches a broader
swathe of the scientific community than it drew 
from. This observation makes sense, consider- 
ing that these journals aim to reach a broader 
readership and to publish major advances. 

Sometimes, the fields that inspire a paper 
differ markedly from those on which it has an 
impact. For example, ‘The Digital Code of DNA’, 
a 2003 Nature essay by systems biologists Leroy
Hood and David Galas®, takes most of its inspira-
tion from molecular biology, yet is cited across
computer science, clinical medicine and social 
science. We quantify cross-disciplinarity on a
scale from zero to one. In this case, zero implies
all disciplines that inspired an article and all those
it impacts are identical; a score of one implies
these lists differ completely (using the Jensen-
Shannon divergence, a measure of the similarity
between two probability distributions; see SI).

INSPIRATION AND IMPACT

The diversity of disciplines in articles’ citations (impact)
and references (inspiration) is growing; the likelihood of
articles crossing disciplines is not. Articles in Nature
and Science are more broadly cited across disciplines.

Interdisciplinarity

How many, how diverse and how balanced
disciplines are across an article’s references and
citations. This is growing across all of science.
Journals shown: Nature, Science, Phys. Rev. Lett.

What we see is that in recent decades 
cross-disciplinarity has declined, with that 
of the general-science journals falling faster 
than the scientific literature overall. Perhaps 
this is because articles that bridge disciplines 
influence multiple fields, including those from 
which they arose. As works draw on a broader
set of disciplines, there is less scope to influ-
ence a set of completely different disciplines.
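The cross-disciplinarity score discussed above compares two distributions: the disciplines an article draws on and the disciplines that cite it. A hedged sketch of the Jensen-Shannon divergence follows. It assumes base-2 logarithms, which is what bounds the value between zero and one; the SI may normalize differently, and the discipline names are made up.

```python
import math


def jensen_shannon_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2) between two count dictionaries."""
    keys = set(p_counts) | set(q_counts)

    def normalise(counts):
        total = sum(counts.values())
        return {k: counts.get(k, 0) / total for k in keys}

    p, q = normalise(p_counts), normalise(q_counts)
    # The mixture distribution sits halfway between the two inputs.
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(a, b):
        # Terms with a[k] == 0 contribute nothing to the divergence.
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Inspired by and cited in exactly the same mix of disciplines.
same = jensen_shannon_divergence({"biology": 3, "physics": 1},
                                 {"biology": 3, "physics": 1})
# Inspired by one field, cited only in another.
disjoint = jensen_shannon_divergence({"biology": 4}, {"computing": 4})
```

An article such as the Hood and Galas essay, which is inspired mainly by one field and cited across several others, would land between these two extremes.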

Assessment of scientific work generally 
works best when contextualized within its 
specific discipline. For example, citation 
counts are more effective when comparing 
biomedical papers to other biomedical papers 
rather than to physics papers. But if interac- 
tions between disciplines are increasing, then 
a stringent, coherent assignment makes less
sense. We speculate that considering how disci- 
plines intermix within individual articles might 
allow better comparisons across disciplines or 
improve assessment of a paper's impact. What’s
more, strictly structured research departments 
and funding programmes make less sense if 
boundaries between disciplines are becoming 
less distinct. As network scientists, we relish 
the idea that science is becoming less siloed. 

The increase we observe in interdisciplinary 
thinking is seen across disciplines (see SI) and 
shows no signs of slowing. With the popula- 
tion of researchers, scientific literature and 
knowledge ever growing, the scientific endeav- 
our increasingly integrates across boundaries. 
Research institutions and funding bodies 
would do well to realize that interdisciplinarity 
is becoming the norm.
Highlights from 150 years of Nature


10 extraordinary papers 


Genetics 


The structure of DNA 


Georgina Ferry 


In the early 1950s, the identity of genetic material was still 
a matter of debate. The discovery of the helical structure 
of double-stranded DNA settled the matter and changed
biology forever.


On 25 April 1953, James Watson and Francis 
Crick announced¹ in Nature that they “wish to
suggest” a structure for DNA. In an article of 
just over a page, with one diagram (Fig. 1), they 
transformed the future of biology and gave 
the world an icon — the double helix. Recog- 
nizing at once that their structure suggested a 
“possible copying mechanism for the genetic 
material”, they kick-started a process that, over 
the following decade, would lead to the crack- 
ing of the genetic code and, 50 years later, to 
the complete sequence of the human genome. 

Until that time, biologists had still to be 
convinced that the genetic material was 
indeed DNA; proteins seemed a better bet. Yet 
the evidence for DNA was already available. In
1944, the Canadian-US medical researcher 
Oswald Avery and his colleagues had shown? 
that the transfer of DNA from a virulent to a
non-virulent strain of bacterium conferred 
virulence on the latter. And in 1952, the biol- 
ogists Alfred Hershey and Martha Chase had 
published evidence’ that phage viruses infect 
bacteria by injecting viral DNA. 

Watson, a 23-year-old US geneticist, arrived 
at the Cavendish Laboratory at the University 
of Cambridge, UK, in autumn 1951. He was 
convinced that the nature of the gene was 
the key problem in biology, and that the key 
to the gene was DNA. The Cavendish was a phys-
ics lab, but also housed the Medical Research
Council’s Unit for Research on the Molecular 
Structure of Biological Systems, headed by 
chemist Max Perutz. Perutz’s group was using 
X-ray crystallography to unravel the structures 
of the proteins haemoglobin and myoglobin. 
His team included a 35-year-old graduate stu- 
dent who had given up physics and retrained 
in biology, and who was much happier working 
out the theoretical implications of other peo- 
ple’s results than doing experiments of his own: 
Francis Crick. In Crick, Watson found a ready 
ally in his DNA obsession. 


Figure 1| The DNA double helix. This drawing
appeared in Watson and Crick’s report¹ of
the structure of DNA, and was produced by
Crick’s wife, Odile.


However, DNA was the project of Maurice 
Wilkins at King’s College London. Crick was a 
friend of Wilkins’s, and it wasn’t the done thing 
for labs to compete over the same molecule. 
Moreover, the experienced X-ray crystallo- 
grapher Rosalind Franklin had just taken over 
experimental work on DNA at King’s. Owing to
a misunderstanding about their relative roles,
Franklin’s relationship with Wilkins was frosty. 

None of this stopped Watson and Crick
from speculating about how the com-
ponents of the DNA molecule (the four
nucleotide bases adenine, guanine, thymine
and cytosine, connected to a backbone of
sugars and phosphates) might assemble into
fibres. They thought that a helix was a likely
option: the US chemist Linus Pauling and his
co-workers had just demonstrated‘ that pep-
tide chains formed α-helices. Crick himself had
co-authored a paper on the theory of diffrac-
tion of X-rays by helices®. In late 1951, he and
Watson combined that theory with what they
knew about the chemistry of DNA, and what
they remembered of talks given by Wilkins and
Franklin, to build a model of the DNA structure.

© 2019 Springer Nature Limited. All rights reserved.

They got it badly wrong: Wilkins and Franklin
quickly demolished it. The head of the Caven- 
dish, Lawrence Bragg, was furious, and banned 
Watson and Crick from doing any further work 
on DNA. But then, in February 1953, the Caven-
dish team received a manuscript from Pauling
that contained a DNA model. It was wrong, but 
Watson and Crick were alarmed that Pauling 
was potentially near a solution. 

This time, Bragg agreed that they might try 
to get there first. Franklin was soon to move 
to Birkbeck College, London, and was leaving 
the DNA work to Wilkins. She and her graduate 
student, Raymond Gosling, had given Wilkins 
a photograph of the X-ray-diffraction pattern 
produced by the B form of DNA. Watson went to 
see Wilkins, who showed him the photograph, 
without Franklin and Gosling’s knowledge. 

The now famous ‘Photograph 51’, together
with other unpublished data of Franklin’s that 
Perutz had shown Watson and Crick, told the 
pair that DNA did indeed form a helix, and that 
the structure consisted of two chains running 
in opposite directions. Watson was stumped, 
however, over how the bases could pair up 
between the two. He made cardboard cutouts 
of the bases, trying to fit them together, but 
nothing seemed to work. 

His colleague Jerry Donohue then pointed 
out that he was using the molecular struc- 
tures of the enol isomers of the bases, which 
cannot form the hydrogen bonds necessary 
for base-pairing. Once Watson had made cut- 
outs of the alternative keto isomers, he had the 
blinding revelation that when guanine bonded 
to cytosine, it made an identical shape to that 
of adenine bonded to thymine, and that the 
shapes fitted perfectly into the helical frame 
provided by the backbones of each DNA chain. 
This explained biochemist Erwin Chargaff’s 
discovery that the DNA of any species has 
the same amount of guanine as of cytosine, 
and of adenine as of thymine®. It also showed 
that each DNA chain in a helix provides a per-
fect template for the other, reading the base
sequence in opposite directions. 
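The template idea can be illustrated with a short sketch. It assumes only the pairing rules stated above (adenine with thymine, guanine with cytosine) and that the partner strand is read in the opposite direction. The function name and the example sequence are made up for illustration.

```python
from collections import Counter

# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the strand that pairs with `strand`, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


strand = "AAGCT"
partner = partner_strand(strand)

# Base counts over both strands of the duplex.
duplex_counts = Counter(strand + partner)
```

Because every base on one chain fixes its partner on the other, the two chains together always contain equal numbers of adenine and thymine, and of guanine and cytosine, which is the regularity that Chargaff had measured. Applying the function twice returns the original strand, so each chain can serve as the template for the other.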


Within days, Watson and Crick had built a 
new model of DNA from metal parts. Wilkins 
immediately accepted that it was correct. It 
was agreed between the two groups that they 
would publish three papers simultaneously in 
Nature, with the King’s researchers comment- 
ing on the fit of Watson and Crick’s structure 
to the experimental data, and Franklin and 
Gosling publishing Photograph 51 for the 
first time’®. 

The Cambridge pair acknowledged in their 
paper that they knew of “the general nature 
of the unpublished experimental results and 
ideas” of the King’s workers, but it wasn’t until 
The Double Helix, Watson’s explosive account 
of the discovery, was published in 1968 that 
it became clear how they obtained access to 
those results. Franklin had died of cancer a 
decade previously; her death prevented her 
from sharing the Nobel prize awarded to 
Watson, Crick and Wilkins in 1962. 

The immediate reception of the double-he- 
lix model was surprisingly muted’, perhaps 
because there was no obvious mechanism 
to explain its role in protein synthesis. In a
landmark talk in 1957, Crick proposed that 
the base sequence encoded the sequence 
of amino acids in a protein, and that protein 
production involved RNA both as a template 
and as an ‘adaptor’ that would enable amino 
acids to be attached to one another in the right 
order. He also supported the suggestion — 
originally made informally by the physicist 
George Gamow to the members of the ‘RNA 
Tie Club’ convened by Gamow and Watson, 
but also independently proposed by biolo- 
gist Sydney Brenner” — that triplets of bases 
(which Brenner called codons) encode the 
20 amino acids commonly found in proteins. 
Finally, Crick expounded what he called the 
‘central dogma’ of biology: that information 
can flow from nucleic acids to proteins, but 
not the other way round”. 

These predictions were confirmed by 
experiment in the next few years. In 1958, the 
biochemists Matthew Meselson and Franklin 
Stahl showed that one DNA strand acts as a
template for the formation of a new strand”. 
The same year, Arthur Kornberg and his 
colleagues published their discovery of the 
enzyme DNA polymerase”, which adds bases 
to newly forming strands. Messenger RNA, 
transfer RNA and ribosomal RNA were all 
quickly identified. 

In 1961, Marshall Nirenberg and Heinrich Mat- 
thaei were the first to crack part of the genetic 
code, demonstrating that bacterial extracts 
synthesize only the amino acid phenylalanine 
from RNA that contains just one type of RNA 
base" (uracil; U). The same year, Crick, his indis- 
pensable female technician Leslie Barnett and 
their co-workers reported mutation studies that 
confirmed the existence of the triplet-based 
code’, and which therefore suggested that the 
codon for phenylalanine was UUU. The race to
identify the full set of codons was completed
by 1966, with Har Gobind Khorana contributing 
the sequences of bases in several codons from 
his experiments with synthetic polynucleotides 
(see go.nature.com/2hebk3k). 
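Reading a message in triplets, as described above, can be sketched as follows. The table is deliberately tiny: UUU for phenylalanine comes from the text, and the other three entries are standard genetic-code assignments added here for illustration. The function name is hypothetical, and a real translation would need all 64 codons plus start and stop signals.

```python
# A deliberately incomplete codon table (RNA triplet -> amino acid).
CODONS = {
    "UUU": "Phe",  # phenylalanine, the codon inferred from poly-U RNA
    "AAA": "Lys",  # lysine
    "CCC": "Pro",  # proline
    "GGG": "Gly",  # glycine
}


def translate(rna):
    """Read `rna` three bases at a time; unknown triplets are marked '?'."""
    usable = len(rna) - len(rna) % 3  # ignore an incomplete final triplet
    return [CODONS.get(rna[i:i + 3], "?") for i in range(0, usable, 3)]


poly_u = translate("U" * 9)
mixed = translate("UUUAAAC")
```

An RNA made only of uracil can be read in only one way, whichever base the reading starts on, which is why it was such a clean first test of the code.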

With Fred Sanger and colleagues’ publica- 
tion” of an efficient method for sequencing 
DNA in 1977, the way was open for the com- 
plete reading of the genetic information in 
any species. The task was completed for the 
human genome by 2003, another milestone 
in the history of DNA.

Watson devoted most of the rest of his 
career to education and scientific administra- 
tion as head of the Cold Spring Harbor Labo- 
ratory in Long Island, New York, and serving 
(briefly) as the first head of the US National 
Center for Human Genome Research, now the
National Human Genome Research Institute. 
Always outspoken, he was eventually removed 
from his emeritus position at Cold Spring Har- 
bor when he repeatedly aired controversial 
opinions about genetics, race and intelligence. 

Crick continued to tackle hard problems in 
science, moving in 1977 from Cambridge to the
Salk Institute in La Jolla, California, where he 
spent the rest of his life working on the neural 
basis of consciousness” and, specifically, of 
visual perception. He died in 2004, aged 88. 

The double helix put genetics on a 
physical footing that would shed light on 
almost every aspect of modern biology and 
medicine. Examples include the migration of 
human populations throughout history; ecol- 
ogy and biodiversity; cancer-causing muta- 
tions in tumours and their drug treatment; 
surveillance of microbial drug resistance in 
hospitals and the global population; and the 
diagnosis and treatment of rare congenital dis-
eases. DNA analysis has long been established
in forensics, and research into more-futuristic
applications, such as DNA-based computing, 
is well advanced. 

Paradoxically, Watson and Crick’s iconic 
structure has also made it possible to recog- 
nize the shortcomings of the central dogma, 
with the discovery of small RNAs that can reg- 
ulate gene expression, and of environmental 
factors that induce heritable epigenetic 
change. No doubt, the concept of the double 
helix will continue to underpin discoveries in 
biology for decades to come. 


Georgina Ferry is a science writer based in 
Oxford, UK. A revised edition of her biography 
Dorothy Crowfoot Hodgkin has just been 
published by Bloomsbury Reader.


Detection of a strange particle


Taku Yamanaka 


In 1947, scientists found a previously unseen particle, which 
is now called a neutral kaon. This work led to the discovery of 
elementary particles known as quarks, and ultimately to the 
establishment of the standard model of particle physics. 


In the late 1940s, the physicists George 
Rochester and Clifford Butler’ observed 
something unusual in their charged-particle 
detector. They were studying the interactions 
between high-energy cosmic rays and a lead 
plate in the detector when they spotted 
V-shaped particle tracks (Fig. 1a). The small
gap between the lead plate and the vertex of
the tracks indicated that an invisible neutral 
particle had been produced in the plate, had 
travelled for a short distance and had then 
decayed into two visible charged particles. 
The mass of the neutral particle was about
1,000 times that of an electron, implying
that it must be a previously unreported type
of particle. This discovery paved the way for
many puzzles and surprises in particle physics
in the decades that followed.

Figure 1| Particle detection that led to a better understanding of fundamental
physics. a, In 1947, Rochester and Butler’ analysed the particles produced
when high-energy cosmic rays hit a lead plate (the broad central stripe) in a
charged-particle detector. In certain photographs, they spotted evidence of a
previously undetected, invisible neutral particle decaying into two visible charged
particles, which were identified by tracks (labelled with arrows). b, The discovery

At the time of Rochester and Butler’s work, 
protons, neutrons, electrons and particles 
called pions (short for π mesons) had been
identified, and were known to be sufficient 
to form atoms. Pions were proposed? in 1935 
to explain how protons and neutrons are held 
together in small atomic nuclei by the strong 
nuclear force, and were found experimen- 
tally** in 1947. 

While searching for a pion in cosmic rays, 
scientists discovered a different particle’, 
which is now called a muon. A heavy charged 
particle was then found’ in 1944, followed by 
Rochester and Butler’s unstable neutral parti- 
cle. But the discovery of unexpected particles 
did not stop there. Then came the τ meson,
which decays into three pions; the θ meson,
decaying into two pions; the κ meson, decay-
ing into a muon and an invisible particle; the
Λ⁰ particle, decaying into a proton and a pion;
and the list goes on.

In the early 1950s, researchers began 
producing these rare particles in large 
quantities by firing protons at targets in particle 
accelerators. The τ, θ and κ mesons and Λ⁰ parti-
cle were peculiar, because, although they were
generated by the strong force, their decay times 
were much longer than those expected for this 
force. To explain these observations, physicists 
proposed a quantity, known as strangeness (S),
that is conserved by the strong force’®. 

Figure 1b panel labels: Mesons, Baryons.

Protons and neutrons have S = 0, and
through the strong force, can produce a
pair of strange particles that have S = −1 and
S = +1, so that total strangeness is conserved.
However, a strange particle that has S = −1,
for example, cannot decay into particles that
have S = 0 through the strong force, because
strangeness would not be conserved. Instead,
this decay must occur much more slowly
through the weak nuclear force, which allows
total strangeness to change.
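The bookkeeping behind this argument can be written as a short sketch. The strangeness values in the table are the standard assignments for these particles, and the particle labels and function names are illustrative choices rather than anything defined in the text. The check looks at strangeness only; a real process must also conserve charge, baryon number and energy.

```python
# Strangeness quantum number S for a few particles (standard assignments).
STRANGENESS = {
    "p": 0, "n": 0,
    "pi+": 0, "pi-": 0, "pi0": 0,
    "K+": 1, "K0": 1,
    "K-": -1, "K0bar": -1,
    "Lambda0": -1,
}


def strangeness_change(initial, final):
    """Total S of the final particles minus total S of the initial particles."""
    return sum(STRANGENESS[x] for x in final) - sum(STRANGENESS[x] for x in initial)


def strong_force_allows(initial, final):
    """True if strangeness is conserved (other conservation laws are not checked)."""
    return strangeness_change(initial, final) == 0


# Pair production in a proton-proton collision: S = -1 and S = +1 appear together.
production = strong_force_allows(["p", "p"], ["p", "Lambda0", "K+"])
# A lone Lambda0 decaying to non-strange particles changes S by one unit.
decay = strong_force_allows(["Lambda0"], ["p", "pi-"])
```

The production step balances, so it can proceed quickly through the strong force. The decay does not balance, so it is left to the slower weak force, which accounts for the long decay times.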

As the accuracy of accelerator-based 
measurements increased, it became clear 
that the τ and θ mesons had extremely similar
masses and lifetimes. Scientists concluded 
that these mesons must be the same parti- 
cle, which is able to decay into two or three 
pions. The mess of strange mesons was finally 
cleaned up into four particles dubbed kaons 
(short for K mesons): K⁺ and K⁰ and their anti-
particles K⁻ and K̄⁰.

However, accepting that the τ and θ mesons
were the same particle raised another prob- 
lem. A state of two pions has even parity, which 
means that its wavefunction does not change 
sign under a parity transformation (in which 
spatial coordinates are flipped). By contrast, 
a state of three pions has odd parity. If the 
same particle could decay into two or three 
pions, did that mean that, contrary to all con- 
ventional wisdom, parity is not conserved by 
the weak force? This question, known as the 
τ–θ puzzle, led to the discovery, in 1957, of
such parity-symmetry breaking in cobalt-60 
decays’ and in pion decays”. 

A consequence of parity-symmetry breaking
by the weak force is that elementary particles 
called neutrinos can be only left-handed, 
which means that their motion and intrinsic 
angular momentum are in opposite directions. 
Under a parity transformation, a left-handed 
neutrino becomes a right-handed neutrino, 
which does not exist. However, if one then
applies a charge-conjugation transformation
(in which particles are replaced by their
antiparticles), the right-handed neutrino
becomes a right-handed antineutrino, which
does exist. The weak force therefore seemed
to conserve CP symmetry (symmetry under
a combined charge-conjugation and parity
transformation), until such symmetry was
found to be broken in neutral-kaon decays.

Figure 1 (continued)| of many more particles following Rochester and Butler’s work led to a model”?
in which all of the known mesons and baryons (two classes of particle) consist of
elementary particles called up (u), down (d) and strange (s) quarks, along with their
antiparticles (denoted by overbars). The η, η′ and π⁰ mesons comprise mixtures
of quark pairs. The mesons and baryons are arranged by their strangeness
(a quantity that is related to the presence of strange quarks) and electric charge.

A neutral kaon is a mixture of K⁰ and K̄⁰
states, and can exist as the CP-even state
K_even or the CP-odd state K_odd. The lifetime of
K_odd is much longer than that of K_even, so these
particles were named K_L (for ‘K-long’) and K_S
(for ‘K-short’), respectively. A useful conse-
quence of such lifetimes is that, if neutral kaons
are produced by firing protons at a target, the
CP-even K_S component quickly decays, leaving
only the CP-odd K_L component. In 1964, such
K_L particles were observed" to decay into the
CP-even state of two oppositely charged pions
(π⁺π⁻). Therefore, despite expectations, CP
symmetry was shown to be broken.
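How quickly the short-lived component disappears can be estimated with a rough sketch. The two mean lifetimes below are approximate published values and do not come from the text; the equal starting mix and the use of proper time (ignoring relativistic time dilation in a real beam) are simplifying assumptions. The function names are illustrative.

```python
import math

# Approximate mean lifetimes in seconds (assumed values, not from the article).
TAU_K_SHORT_S = 8.95e-11
TAU_K_LONG_S = 5.12e-8


def surviving_fraction(proper_time_s, lifetime_s):
    """Fraction of particles not yet decayed after `proper_time_s`."""
    return math.exp(-proper_time_s / lifetime_s)


def long_lived_purity(proper_time_s, initial_short_share=0.5):
    """Share of the surviving neutral kaons that are the long-lived kind."""
    short = initial_short_share * surviving_fraction(proper_time_s, TAU_K_SHORT_S)
    long_ = (1.0 - initial_short_share) * surviving_fraction(proper_time_s, TAU_K_LONG_S)
    return long_ / (short + long_)


purity_at_start = long_lived_purity(0.0)
purity_after_1ns = long_lived_purity(1e-9)
```

After about a nanosecond of proper time, almost nothing of the short-lived component is left under these assumptions. That is why a decay into two pions seen far from the target was so unexpected.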

In that same year, physicists proposed a 
model"? to explain all of the known mesons 
and baryons — a family that includes protons, 
neutrons and the Λ⁰ particle. In the model,
these mesons and baryons consist of elemen- 
tary particles known as quarks, which come 
in three types: up, down and strange (Fig. 1b).

In 1973, a theoretical model showed 
that the breaking of CP symmetry could be 
explained by introducing three more quarks: 
charm, top and bottom. In this framework, K_L
can have a small component of K_even that can
decay into the CP-even ππ state. But unlike
other theoretical models, this framework also
allows K_odd to decay into the CP-even state
(direct CP violation).

Many generations of experiments then were
carried out to see whether direct CP violation 
exists. The measurement required extremely 
high precision, and after many improvements 
over 25 years, direct CP violation was finally 
confirmed’. Together with the observation 
of CP-symmetry breaking in B mesons (mesons 
that contain a bottom quark)"”"®, the theoreti- 
cal model was confirmed, and helped to estab- 
lish the standard model of particle physics, 
which is the current explanation of the Uni- 
verse’s particles and forces. 

However, the standard model is not 
complete. For instance, it cannot explain why 
the Universe contains so little antimatter, nor 
what the mysterious substance called dark 
matter is. Researchers are therefore trying to 
search for a hint of particle physics beyond 
that of the standard model. For example, 
experiments in Japan” and Europe” are using 
extremely rare kaon decays to search for such 
a hint. 

In retrospect, Rochester and Butler’s 
V-shaped particle tracks are thought to have 
been caused by a K_S, produced in the lead plate,
decaying into the π⁺π⁻ state. Since their work,
kaons have been used to discover strangeness 
and the breaking of parity and CP symmetries, 
to build the quark model and the standard 
model, and now to search for previously 
unseen particle physics. Could Rochester 
and Butler have ever imagined that they had 
opened such a treasure chest?


Neuroscience


Neuronal signals 
thoroughly recorded 


Alexander D. Reyes 


Originally developed to record currents of ions flowing 
through channel proteins in the membranes of cells, the
patch-clamp technique has become a true stalwart of the
neuroscience toolbox.


Information in the brain is thought to be 
encoded as complex patterns of electrical 
impulses generated by thousands of neuronal 
cells. Each impulse, known as an action poten-
tial, is mediated by currents of charged ions
flowing through a neuron’s membrane. But how
the ions pass through the insulated membrane
of the neuron remained a puzzle for many 
years. In 1976, Erwin Neher and Bert Sakmann 
developed the patch-clamp technique, which 
showed definitively that currents result from 
the opening of many channel proteins in the 
membrane’. Although the technique was 
originally designed to record tiny currents, it 
has since become one of the most important 
tools in neuroscience for studying electrical 
signals — from those at the molecular scale to 
the level of networks of neurons. 

By the 1970s, current flowing through the 
cell was generally accepted to result from 
the opening of many channels in the membrane,
although the underlying mechanism was 
unknown. At that time, current was commonly 
recorded by impaling tissue with a sharp elec-
trode (a pipette with a very fine point). Unfor-
tunately, however, the signal recorded in this
way was excessively noisy, and so only the large,
‘macroscopic’ current — the collective current 
mediated by many different types of channel — 
that flows through the tissue could be resolved. 

In 1972, Bernard Katz and Ricardo Miledi’, 
pioneers of the biology of the synaptic connec-
tions between cells, managed to infer from the
macroscopic current certain properties of the 
membrane channels, but only after a heroic 
effort to exclude all possible confounding fac- 
tors. The problem was that the macroscopic 
current could be influenced by factors not 
directly related to channel activity, such as 
cell geometry and modulatory processes that 
regulate cell excitability. Also troublesome was 
that interpretations of macroscopic-current 
features were based on unverified assumptions 
about the statistics of individual channel activ-
ity’?. Despite Katz and Miledi’s careful analyses,
there was a lingering doubt about whether their
conclusions were correct. The crucial data
were obtained by Neher and Sakmann using
patch clamp. 

The patch-clamp technique is conceptually 
rather simple. Instead of impaling the cells, 
a pipette with a relatively large diameter is 
pressed against the cell membrane. Under the 
right conditions, the pipette tip ‘bonds’ with 
the membrane, forming a tight seal. This sub- 
stantially reduces the noise compared with that 
encountered using sharp electrodes, because 
the small patch of membrane encompassed 
by the pipette tip is electrically isolated from 
the rest of the cell’s membrane and from the 
environment surrounding the cell (Fig. 1). 

The tiny currents passing through the few 
channels in the patch were thus observed for 
the first time. The recording confirmed key 
channel properties: when channels open, 
there is a step-like jump in the current trace 
and, when they close, a step-like drop back 
to baseline. It was now possible to determine 
details such as the statistics of the opening and 
closing of channels, the amplitude of the cur-
rents they mediate and the optimal stimuli that
trigger their opening. For this work, Neher and 
Sakmann were awarded the 1991 Nobel Prize in 
Physiology or Medicine. 
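The step-like trace described above is what a channel with just two states, closed and open, would produce. The following toy simulation is a sketch under that assumption only: the switching probabilities and the 2-picoampere unit current are arbitrary illustrative values, and nothing here is taken from Neher and Sakmann's analysis.

```python
import random


def simulate_channel(n_steps, p_open, p_close, unit_current_pa, seed=0):
    """Simulate one two-state channel; return the current at each time step.

    p_open:  chance per step that a closed channel opens (illustrative value)
    p_close: chance per step that an open channel closes (illustrative value)
    """
    rng = random.Random(seed)  # fixed seed so the trace is reproducible
    is_open = False
    trace = []
    for _ in range(n_steps):
        if is_open:
            if rng.random() < p_close:
                is_open = False
        elif rng.random() < p_open:
            is_open = True
        # The current is either zero (closed) or one fixed unit (open).
        trace.append(unit_current_pa if is_open else 0.0)
    return trace


def open_fraction(trace):
    """Fraction of time steps during which the channel was open."""
    return sum(1 for value in trace if value > 0) / len(trace)


trace = simulate_channel(n_steps=500, p_open=0.05, p_close=0.1, unit_current_pa=2.0)
```

The trace contains only two current levels, so every opening or closing shows up as an abrupt step. From a recording like this, the time spent open and the size of the unit current can be read off directly, which is the kind of single-channel statistic that macroscopic recordings could only infer.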

Improvements in patch clamp made it 
feasible to study channels in various prepa- 
rations‘ to finally address long-standing 
questions. There was particular interest in ver- 
ifying a model for action-potential generation® 
proposed by Nobel laureates Alan Hodgkin and 
Andrew Huxley in the 1950s. Specific predic-
tions of the model could now be tested directly
by examining the current through individual 
channels and by observing the changes in cur- 
rent that occur when the molecular structure 
of the channel is modified®. Ultimately, the 
model was shown to be mostly correct and 
remains the gold standard for computational 
neuroscientists today. 

One of the several variants of patch clamp,
the whole-cell configuration, found an audi-
ence with neuroscientists studying electrical
phenomena in neurons beyond the channel
level. To achieve whole-cell recording, the
patch of membrane under the electrode is
ruptured, enabling electrical access to the cell.
Compared with the use of sharp electrodes,
whole-cell patch clamp allows much more accu-
rate recordings and, crucially, is less damaging
to the cell. This allowed systematic investiga-
tion of synergistic processes at the cellular
level, such as the regulation of macroscopic
currents by modulatory molecules, and inter-
actions between the different types of channel
in the neuron.

Figure 1| The patch-clamp technique used at different scales. a, Neher and
Sakmann’ developed the cell-attached patch-clamp technique. An electrode
(a fine pipette) is pressed against a ‘patch’ of the cell membrane so that ion
currents (red dotted arrow) passing through channel proteins in the patch
under the electrode can be recorded. In the whole-cell configuration, the
patch is ruptured so that the whole-cell macroscopic current (blue dotted

The relatively large opening created in the 
cell in the whole-cell configuration also pro- 
vided access to the cell by chemicals, enabling 
dyes to be delivered for visualizing intricate 
cell structures, and RNA to be extracted for 
gene-expression analysis’. Neher’s group 
examined the sequence of events that underlie 
the transfer of information between cells by 
introducing chemicals into the cell and simul-
taneously tracking the resulting changes in the
electrical properties of the cell’s membrane’. 

Whole-cell patch clamp proved ideal for 
studying the collective properties of neu- 
rons and neuronal networks in brain slices 
maintained in vitro. A challenge in working 
with more-complex systems such as neu- 
ronal networks is that the number of possible 
confounding factors increases. Sakmann’s 
solution in the 1990s was to carry out simulta-
neous whole-cell recording using two or three
electrodes, which to some seemed excessive 
because comparable data could be obtained by 
sequential recordings using fewer electrodes. 
However, the rationale was that taking time to 
design the near-perfect experiment mitigated 
later difficulties in data interpretation 
analogous to those faced by Katz and Miledi. 

Hence, simultaneous recordings from
different parts of the neuron definitively con-
firmed that action potentials are initiated at
one part of the main long neuronal protrusion
(the axon) and propagate back to the dendrites
(clustered protrusions that receive inputs from
other neurons)’. The mechanisms that under-
lie signalling between neurons were directly
investigated by placing electrodes on either
side of a synaptic connection”®. Moreover,
triple recordings from neurons of different
classes uncovered certain basic principles of
network organization”.

Figure 1 panel labels: b, Cellular; c, Networks.

The patch-clamp technique is also used to 
examine cell activities under more natural 
conditions. To study how sensory stimuli 
and movements are represented in the brain, 
experiments must be carried out in living 
animals. The challenge with this approach, 
however, is that the slightest movement can 
dislodge an electrode from the neuron. Whole-cell
patch-clamping turns out to be remarkably
stable because of the tight seal between
the electrode and the membrane. Thus, this
technique has permitted recording from dendrites
and pairs of neurons in anaesthetized
rodents, and even from animals that are able
to walk and run.

Patch-clamp recording is arguably still 
the most direct and effective way of study- 
ing electrical signals in the brain. The data 
obtained with this technique essentially rep- 
resent the ground truth for investigators in 
many branches of neuroscience, from theorists
to translational researchers developing
drugs for the treatment of certain brain
conditions, including epilepsy and autism
spectrum disorder.

arrow), which represents the summed currents from the entire cell, can be
recorded. b, Simultaneous whole-cell recordings from different parts of
a neuron can determine, for example, the direction of travelling signals.
c, Whole-cell recordings can be made from a small network of connected
neurons. d, Whole-cell recording can even be made in the brains of animals
performing a task or walking around freely.


Moreover, patch clamp complements 
modern ‘optogenetic’ techniques, which 
enable control and visualization of the activ- 
ities of large populations of neurons using 
light. Emerging technologies, such as prostheses
for vision, will probably rely heavily
on patch-clamp recording to establish the 
optimal conditions for converting external 
stimuli into electrical signals. Patch-clamping 
will clearly remain a vital tool for the neurosci- 
entist in the foreseeable future. 


Materials science


Birth of a class of nanomaterial


Ryong Ryoo 


Nearly 30 years ago, a simple chemical principle was reported
that enabled the synthesis of a plethora of porous materials,
some of which might enable applications ranging from
biomedicine to petrochemical processing.


In 1992, Kresge et al. reported a breakthrough
in materials science. They described multi- 
molecular templates that guide the assembly 
of ordered mesoporous molecular sieves — 
materials that contain uniform, regularly 
arranged pores with mesoscopic diameters 
(between 2 and 50 nanometres). Their find- 
ings triggered an explosion of research into 
mesoporous materials, which have since 
been intensively studied for applications as 
diverse as catalysis, molecular adsorption, 
drug delivery and molecular separations using 
membranes. 
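
As a rough illustration of the size ranges mentioned above, the short Python sketch below sorts pores into the conventional categories by diameter. The function name `classify_pore` and the sample diameters are illustrative assumptions, not taken from Kresge and colleagues' paper. The boundaries follow the 2-50 nm definition of mesopores given in the text, with smaller pores labelled micropores (as in zeolites) and larger ones macropores.

```python
def classify_pore(diameter_nm: float) -> str:
    """Classify a pore by its diameter in nanometres.

    Boundaries follow the conventional size bands: below 2 nm is a
    micropore, 2-50 nm is a mesopore, above 50 nm is a macropore.
    """
    if diameter_nm <= 0:
        raise ValueError("pore diameter must be positive")
    if diameter_nm < 2.0:
        return "micropore"  # typical of zeolite pores
    if diameter_nm <= 50.0:
        return "mesopore"   # the range discussed in the text
    return "macropore"


# Hypothetical sample diameters, chosen only to exercise each band.
for d in (0.7, 4.0, 30.0, 120.0):
    print(f"{d:6.1f} nm -> {classify_pore(d)}")
```

On this scheme, a material with pores a few nanometres across, such as those described for MCM-41, falls in the mesopore band.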

When Kresge and colleagues published 
their work, materials known as zeolites — 
crystalline aluminosilicate compounds 
with uniform pores usually less than 2 nm in 
diameter — had long been used as catalysts 
in petroleum refining and for molecular sep- 
arations. However, large molecules, such as 
those found in the heavy fractions of crude 
oil, were unable to diffuse through the small 
pores of these molecular sieves, and so could 
not be processed efficiently. There had 
been many attempts to obtain zeolite-like 
materials with enlarged pores and ordered 
structures, but the large-pore materials com- 
monly available at that time all had a broad 


distribution of pore diameters, making them 
unsuitable for many applications. 

In 1990, it was reported that the spaces
between the layers of a silicate material called
kanemite could be expanded by adding organic
molecules containing long hydrocarbon chains
to a suspension of kanemite powder in water.
This process could generate pores of up to 4 nm
in diameter, but worked only with kanemite.

Enter Kresge and colleagues, whose method 
for making mesoporous materials began 
with the formation of layers of silica, a few 
nanometres thick, in between the surfaces of 
cylindrical supramolecular assemblies called 
micelles (Fig. 1). The micelles consisted of 
numerous detergent-like molecules, known 
as surfactants, that were packed together to 
form a liquid-crystal structure reminiscent of
a honeycomb. Once silica layers had formed 
between the micelles, the researchers heated 
the resulting material in air to remove the sur- 
factant, thereby producing a silica product 
that retained a honeycomb-like array of nanometre-scale
pores. The researchers named
their material Mobil Composition of Matter 
No. 41 (MCM-41), after the oil company that 
they worked for. 

The most impressive feature of Kresge and
colleagues’ strategy was that the diameter,
shape and connectivity of the pores could, 
in principle, be controlled by manipulating 
the structure and size of the surfactant mol- 
ecules. The authors demonstrated only a few 
examples of this: they showed that the pore 
diameter could be controlled within a narrow 
range of about 3-10 nm. Nevertheless, their 
approach was later shown to be applicable to 
the full range of mesopore sizes.

Researchers in the field initially regarded 
Kresge and colleagues’ work as simply extend- 
ing the pore sizes of the existing family of 
molecular sieves. However, it soon became 
apparent that the surfactant-based strategy 
could be used to synthesize many types of 
ordered mesoporous material, including ones 
made from metal oxides, organic polymers
and even transition metals. Having the ability
to make a variety of mesoporous materials
that contain highly ordered arrangements of 
pores opened up many avenues of research for 
nanoscience. 

A key development in 1998 was the use of polymeric
surfactants, which increased the size
of mesopores that could be made to 30 nm.
Polymeric surfactants used in the synthesis
of such large-pored materials, as well as other
organic surfactants (including the one used
to make MCM-41), are now classified as soft
templates, which reflects the somewhat deformable
nature of the micelles that act as the mould.
An advantage of using soft templates is that mesoporous
materials can be made in solution at relatively
low temperatures. Moreover, the porous
structure of the resulting material can easily be 
controlled by making simple modifications to 
the template molecules. 

Another breakthrough, reported in 1999, was
the discovery of hard templating (also known
as nanocasting). In this process, mesoporous
materials are fabricated from precursor molecules
using another solid mesoporous material
as a mould, in a manner analogous to the casting
of concrete pipes or bricks. Nanocasting has
two somewhat cumbersome requirements:
the precursors must infiltrate the pores of
the mould uniformly, without accumulating on
the external surface; and the precursors must
convert completely into the desired product.
The method does, however, work particularly
well when high temperatures (of the order
of 500 °C or more) are needed to synthesize
a mesoporous material. This contrasts with
the use of surfactant-based soft templates,
which typically decompose at temperatures
above 200 °C.

Figure 1| Synthesis of the porous solid MCM-41. In 1992, Kresge et al. reported
the use of cylindrical molecular aggregates, called micelles, as templates for
the synthesis of porous materials. a, In the first step, they formed a silicate
layer between micelles stacked in a hexagonal array; the individual molecules
in the micelles are shown as blue spheres with ‘tails’ attached. The authors then
destroyed the micelles using heat, thereby producing the porous silicate MCM-41.
b, This micrograph of MCM-41 reveals its uniform, honeycomb-like porous
structure. Kresge and colleagues’ template-based strategy has since enabled the
synthesis of a wide range of potentially useful materials that contain ordered
pores of 2-50 nanometres in diameter (mesopores). Scale bar, 100 ångströms.

Nanocasting was first used to make ordered
mesoporous carbon, but has since been developed
as a general approach for synthesizing
nanowires and nanoporous materials of various
compositions, including metal oxides,
organic polymers and metals. Mesoporous
carbons have garnered much interest because
of their high electrical conductivity, and
because they can accommodate a large volume
of guest atoms, molecules or particles inside
the mesopores. For this reason, mesoporous
carbons are considered to be particularly
attractive candidates for electrode materials
in chemical sensors, supercapacitors and
high-performance batteries.

Mesoporous materials are also gaining 
attention for biomedical applications such as 
drug or gene delivery. Mesoporous silicas, in
particular, can be synthesized in various shapes
and sizes, are often biocompatible and spontaneously
degrade in human tissues, a property
that could be used to release drugs trapped in
the silica. Moreover, the ability to accurately
control the diameters of mesopores in silica is 
expected to provide tremendous advantages 
in biomedical applications, because the pore 
sizes directly affect the loading and release 
kinetics of drugs in delivery systems. 

The main uses envisaged for mesoporous 
materials include as adsorbents in industrial 
processes for separating chemicals, and as 
catalysts in petrochemical refinery processes. 
Indeed, the original motivation for Kresge and 
colleagues’ MCM-41 research was to synthesize 
catalytic materials for petroleum refining. But
although MCM-41 had sufficiently large pores 
for this purpose, its glass-like amorphous 
framework showed poor catalytic activity. 

Ever since, enormous efforts have been
made to synthesize mesoporous materials that
contain crystalline, microporous, zeolite-like
frameworks, which exhibit high catalytic
performance. A breakthrough was made
ten years ago, with the report of a specially
designed surfactant molecule that enables
the synthesis of such materials. The catalytic
properties of the resulting mesoporous
zeolites have not been fully explored for industrial
processes, because the required surfactant
is costly and not yet commercially available.
However, I expect that mesoporous zeolites
will trigger the next explosion of research in
this field, by opening up many opportunities
for catalytic applications.


Evolutionary insights from Australopithecus


Dean Falk 


In 1925, a Nature paper reported an African fossil of a previously
unknown genus called Australopithecus. This finding 
revolutionized ideas about early human evolution after human 
ancestors and apes split on the evolutionary tree. 


Australian-born Raymond Dart had barely 
started his job as chair of the anatomy depart- 
ment of the University of the Witwatersrand 
in Johannesburg, South Africa, when he made
a momentous discovery. Using his wife’s 
knitting needles, he painstakingly extracted 
a fossil (Fig. 1) from a chunk of rock found in 
Taungs (now known as Taung), South Africa. 
As he recalled, “the rock parted ... What
emerged was a baby’s face, an infant with a
full set of milk teeth ... I doubt if there was
any parent prouder of his offspring than I 
was of my ‘Taungs baby’ on that Christmas of 
1924.” Better yet, the fossil fitted neatly with 
another type of fossil, called an endocast, 
formed from sediments accumulated inside 
the skull. The endocast reflects brain-surface 
details stamped on the braincase’s inner walls. 
These fossils revealed a combination of ape- 
like and human-like features never previously 
reported together. 

Convinced that the specimen, called the 
Taung Child, represented an extinct link 
between humans and our ape ancestors, Dart 
dispatched a report to Nature by mail boat.
He probably felt some trepidation because
several fellows of the Royal Society in London,
who had mentored and taught with him, con- 
sidered the human forerunner to be the Brit- 
ish specimen known as Piltdown Man (which 
was later exposed as a hoax). Piltdown Man’s 
human-sized brain and ape-like jaw contrasted 
with the Taung Child’s ape-sized brain and 
human-like jaw and teeth. In Dart’s view, the
Taung Child looked more primitive and older
than the main existing candidates for the earliest
ancestral human relative: Piltdown Man
and Java Man (Homo erectus) from Indonesia.
Dart therefore described the Taung Child as
a ‘man-ape’ rather than an ‘ape-man’, like Java
Man, and named the species Australopithecus
africanus, which means southern ape from 
Africa. 

Dart declared that humankind’s cradle was 
not in Indonesia or Britain as his contempo- 
raries thought, but was instead in Africa, as 
Charles Darwin had previously suggested.
The comfortable habitats favoured by African 
chimpanzees and gorillas in Dart’s time were 
more than 3,200 kilometres north of where the 
Taung Child dwelled, and Dart suggested in his 
1925 Nature paper that intense competition for 
limited resources in harsh southern African 
landscapes “furnished a laboratory such as was
essential to this penultimate phase of human 
evolution”. In the paper, he also reasoned 
that “enhanced cerebral powers possessed by 
this group ... made their existence possible 
in this untoward environment”, attributing 
intelligence based on his interpretation of 
human-like brain convolutions at the back of 
the specimen’s endocast. 

Figure 1| Raymond Dart in 1925 holding the Australopithecus africanus fossil called the Taung Child.

When the paper appeared, the Taung Child
and 32-year-old Dart became world famous
overnight. Yet not everyone was receptive to
new ideas about human evolution. Indeed,
five months later, a court case known as the
Scopes monkey trial began in the United States
to settle whether evolution could be taught in
Tennessee schools. The immediate reaction to
Dart’s paper was mainly enthusiastic, but he
soon became a target of ‘you’ll-burn-in-hell’
letters from religious fundamentalists, and his
former London colleagues published harsh
criticisms of his research. Dart’s main champion,
the physician Robert Broom, remarked:
“It makes one rub one’s eyes. Here was a man
who had made one of the greatest discoveries 
in the world’s history — a discovery that may 
yet rank in importance with Darwin’s Origin 
of Species; and English culture treats him as if 
he had been a naughty schoolboy.” 

To answer his critics, Dart spent four years 
preparing a book about the Taung Child. It
provided voluminous extra details about the
endocast, bones and teeth, and bolstered the
argument that humans originated in Africa.
He submitted the book to the Royal Society,
which declined to publish it. The pro-Piltdown
fellows were probably behind this rejection.
Sadly, the book remains unpublished. 

The most controversial aspect of Dart’s 
paper, then and now, is his view that the back 
of the Taung Child’s endocast is human-like. 
Some have argued that Dart misidentified a 
skull imprint as a brain groove similar to a
human one, a feature that is inconsistent with
the Taung Child’s otherwise ape-like brain.
Dart’s 1925 Nature paper describes two endo- 
cast brain grooves, but his book identifies 
14 further grooves, and describes 3 dispersed 
brain regions that look expanded in compari- 
son with those of ape brains. If these findings 
had been published, they might have influ- 
enced the still-controversial debate about 
whether the human brain evolved in a piece- 
meal, mosaic fashion or in a more globally 
connected manner. Some mosaicists still cite 
Dart’s 1925 Nature paper, but his unpublished 
book reveals his globalist viewpoint. 

Dart’s paper stated: “we may confidently 
anticipate many complementary discover- 
ies concerning this period in our evolution.” 
Indeed, thousands of specimens have been 
found that represent various Australopithecus 
species that lived in Africa during different 
time spans from more than 4 million to around
1 million years ago. The fossil Lucy is an example
of one such species, called Australopithecus
afarensis.

Subsequent work confirmed that Dart got 
most of the details right regarding his discov- 
ery. Australopithecus shared features of both 
living apes and humans, and they were bipedal 
as he surmised because the skull opening that
accommodates the spinal cord is positioned
centrally at the base of the specimen’s cranium.
Dart correctly inferred that hominins
originated in Africa, and that our genus Homo 
arose from Australopithecus. Happily, he lived 
long enough to see his initially iconoclastic 
ideas become widely accepted. 

I cannot help but wonder what Dart would 
have thought about another notable discovery
reported in Nature: the 2004 identification
of a species called Homo floresiensis
(the most complete specimen is nicknamed
the Hobbit) from remains in Indonesia dating
to approximately 100,000-60,000 years
ago. Like the Taung Child, the H. floresiensis
specimens showed a combination of features
never previously found in a fossil specimen.
Homo floresiensis had ape-like, Australopithecus-like
and human-like traits, as well as
a tiny brain, leading some to suggest that this
species might be a lineage descended from a
previously unknown early hominin migration
out of Africa.

The parallels with Dart’s discovery are 
remarkable. Homo floresiensis drew world- 
wide attention, but was also met with scorn 
from some scientists (who argued that the 
Hobbit represents an abnormal human). 
Homo floresiensis-like fossils dating to 
700,000 years ago have since been reported,
and its legitimacy as a species is gaining 
traction. It might be equally crucial for unrav- 
elling the evolution of early members of the 
human family tree outside Africa in the way 
that the Taung Child was essential for under- 
standing the evolution of human ancestors 
in Africa. Only time will tell. One thing is cer- 
tain, however; the more palaeoanthropology 
changes, the more palaeopolitics stays 
the same. 


Astronomy


First exoplanet found around a Sun-like star


Eliza Kempton 


In 1995, astronomers detected a blisteringly hot Jupiter-mass 
planet orbiting closer to its host star than Mercury is to the Sun. 
This discovery recast our thinking of how planets form and led 
to a new era of exoplanetary exploration.


Anyone over the age of 35 will remember growing
up in a world in which only one planetary
system was known: our own. We remember
proudly reciting the names of the nine planets
(eight before Pluto’s discovery in 1930, and again
today with its reclassification as a dwarf planet in
2006) and wondering what other planets might
exist around the stars in the night sky. Contemplating
life beyond the Solar System was relegated
to science fiction. This all changed in 1995
when Mayor and Queloz reported the detection
of the first exoplanet around a Sun-like star.

The discovery of the gas-giant planet
(named 51 Pegasi b after its parent star, 51 Pegasi)
came as a surprise. Gas-giant planets, such
as Jupiter, are located in the outer parts of
the Solar System. The prevailing theory was,
and still is, that the formation of these planets
requires icy building blocks that are available 
only in cold regions far away from stars. Yet 
Mayor and Queloz found 51 Pegasi b to be 
orbiting about ten times closer to its host star 
than Mercury is to the Sun (Fig. 1). One possible 
explanation is that the planet formed farther 
out and then migrated to its current location. 

The gas-giant planet was not the first 
exoplanet to be discovered. However, the previous
detections were of even stranger objects
orbiting pulsars — rapidly spinning neutron 
stars, which are the collapsed remnants of hot 
massive stars. The discovery of 51 Pegasi b was 
the first to substantiate the existence of planets 
around long-lived hydrogen-burning stars that 
resemble the Sun. 

The bizarre character of a gas-giant planet 
orbiting so close to its parent star engendered 
considerable scepticism about the true nature 
of 51 Pegasi b. Mayor and Queloz detected the 
planet through minute back-and-forth motion 
of 51 Pegasi, which seemed to indicate that a 
planet-mass object was pulling on the star. 
But this stellar motion, sensed by frequency 
shifts in the spectra of light from 51 Pegasi, 
had other possible interpretations. A lively 
debate ensued in the literature about whether 
pulsations of the star might be masquerading 
as a planetary signature.
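
To give a sense of how small these spectral shifts are, the sketch below applies the standard non-relativistic Doppler relation, in which the wavelength shift equals the rest wavelength multiplied by the radial velocity and divided by the speed of light. The 60 metres-per-second stellar wobble and the 500-nanometre spectral line are illustrative assumptions rather than figures from the text, and the function name is hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, metres per second


def doppler_shift_nm(rest_wavelength_nm: float, radial_velocity_m_s: float) -> float:
    """Return the wavelength shift in nanometres (non-relativistic approximation).

    A positive velocity means the star is moving away from the observer,
    which lengthens the observed wavelength.
    """
    return rest_wavelength_nm * radial_velocity_m_s / C_M_PER_S


# Illustrative values only: a 500 nm line and a 60 m/s stellar wobble.
shift = doppler_shift_nm(500.0, 60.0)
print(f"shift = {shift:.2e} nm")
print(f"fractional shift = {shift / 500.0:.2e}")
```

Under these assumptions the fractional change is of the order of a few parts in ten million, which helps to explain why alternative explanations such as stellar pulsations had to be ruled out carefully.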


This debate was put to rest in 1998 when 
the astronomer David F. Gray wrote a paper 
refuting his previous assertion that the stellar 
spectra were indicative of pulsations rather 
than a planet. Further vindication came
through the detection of planets similar to 
51 Pegasi b, as other researchers combed their
existing data for similarly unexpected planetary
signals. These highly irradiated giant
planets have come to be known as hot Jupiters.

In the 24 years since the discovery of 
51 Pegasi b, about 4,000 exoplanets have 
been identified (see go.nature.com/2jpcgtf). 
Other detection techniques have entered the 
scene, including the transit method, in which 
an exoplanet is revealed through the subtle 
dimming of its host star as the planet crosses 
the line of sight between Earth and the star. Hot 
Jupiters have continued to be discovered by 
the many exoplanet searches that are sensitive 
to large planets on close orbits. However, it is 
now known that such objects are intrinsically 
rare, orbiting only about 1% of Sun-like stars.
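
The dimming involved in the transit method can be estimated with a simple geometric relation: the fraction of starlight blocked is roughly the square of the ratio of the planet's radius to the star's radius. The sketch below is a minimal illustration under that assumption. It ignores limb darkening and grazing transits, and the radii used for Jupiter, Earth and the Sun are approximate reference values that do not come from the text.

```python
def transit_depth(planet_radius_km: float, star_radius_km: float) -> float:
    """Approximate fractional dimming of a star during a central transit.

    Treats the star as a uniformly bright disc and the planet as an
    opaque disc, so the depth is the ratio of the two disc areas.
    """
    if planet_radius_km <= 0 or star_radius_km <= 0:
        raise ValueError("radii must be positive")
    return (planet_radius_km / star_radius_km) ** 2


# Approximate mean radii in kilometres (illustrative reference values).
R_SUN = 695_700.0
R_JUPITER = 69_911.0
R_EARTH = 6_371.0

for name, radius in (("Jupiter-sized", R_JUPITER), ("Earth-sized", R_EARTH)):
    depth = transit_depth(radius, R_SUN)
    print(f"{name} planet, Sun-like star: {depth * 100:.4f}% dimming")
```

With these numbers, a Jupiter-sized planet dims a Sun-like star by about 1%, whereas an Earth-sized planet dims it by less than 0.01%, which is one reason searches are most sensitive to large planets on close orbits.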

By contrast, planets known as super-Earths
and mini-Neptunes abound. Such objects,
which inhabit the size and mass gap between
the rocky and gas-giant planets of the Solar System,
were also a surprise to planet hunters, but
seem to be commonplace in our Galaxy. There
is now good reason to think that the Milky Way
contains more planets than it does stars.

Mayor and Queloz’s detection of 51 Pegasi b
gave rise to a new field of astronomy. The ranks
of exoplanet researchers have been steadily 
growing, by some counts now making up about 
one-quarter of the astronomy profession (see 
go.nature.com/32imc4j). Incipient subfields 
include the study of exoplanet demograph- 
ics and the characterization of exoplanetary 
atmospheres. 

This characterization has confirmed that 
hot Jupiters truly are gas-giant planets, but 
ones representing what our own Jupiter 
would look like if it were suddenly trans- 
ported 100 times closer to the Sun. Amid the 
scorching-hot hydrogen-helium envelopes 
of these planets, astronomers have detected 
trace amounts of steam, carbon monoxide and 
metal vapours. Such atmospheric studies
could lead to the eventual characterization of 
exoplanets that resemble Earth. 

The future of the exoplanet field is bright. 
In April 2018, NASA launched the Transiting 
Exoplanet Survey Satellite (TESS), a space telescope
that is just beginning to fulfil its mission
of finding small transiting planets around the
brightest stars in the night sky. These planets
will be ideally suited for follow-up using
NASA’s James Webb Space Telescope (JWST),
once it launches, to measure their atmospheric
properties and compositions. Following on
the heels of JWST, the European Space Agency
has selected the Atmospheric Remote-sensing
Infrared Exoplanet Large-survey (ARIEL) space
telescope to launch in 2028. ARIEL will be
dedicated to characterizing the atmospheres
of a wide sample of exoplanets.

Figure 1| The planetary systems of the Sun and of 51 Pegasi. a, In the Solar System, gas-giant planets, such
as Jupiter, orbit far from the Sun. In 1995, Mayor and Queloz reported the discovery of 51 Pegasi b, a gas-giant
planet that is much closer to its host star, 51 Pegasi, than Mercury is to the Sun. The orbital distances of
the planets are given in astronomical units (1 AU is the average separation between Earth and the Sun). b, The
sizes of all objects are shown approximately to scale.

These programmes are paving the way 
towards the ultimate goal of potentially 
detecting the signatures of life on an exo- 
planet. This goal could most optimistically 
be achievable in the next decade, but more 
realistically will require a new generation of 
space- and ground-based telescopes. What
is remarkable is that humans have gone from 
discovering the first exoplanets to legitimately 
plotting out the search for life on these worlds 
in just a quarter of a century.


Cell identity reprogrammed


Samantha A. Morris 


The discovery that cell differentiation can be reversed 
challenged theories of how cell identity is determined, laying 
the foundations for modern methods of reprogramming cell 
identity and promising new regenerative therapies. 


All cells of an organism derive from a single
cell. As development progresses, cells become
increasingly specialized to perform defined
functions, a commitment that is accompanied
by a restriction in the range of potential fates
of those cells. In the late nineteenth century,
a predominant thought was that, when they
differentiate, cells retain only those pieces of
heritable information required to maintain
cell-type identity and function. This led to
the theory that differentiation is an irreversible
process (Fig. 1a). John Gurdon’s seminal
paper in Nature on nuclear reprogramming
of cell identity, with Tom Elsdale and Michael
Fischberg, provided a remarkable challenge
to this dogma, and formed the basis for today’s
cell-reprogramming field.

Gurdon and colleagues’ 1958 paper was 
preceded by the work of Robert Briggs and 
Thomas King. To investigate the developmental
potential of differentiating cells, Briggs and
King used a method called nuclear transfer,
in which the nucleus is removed from one
cell (in this case, an egg) and replaced with
an intact nucleus from a different cell. Briggs
and King’s experiments were a technical feat
that had previously been accomplished only
in single-celled organisms.

Using this method in the more-complex 
Northern leopard frog (Rana pipiens), they 
were able to produce normal, swimming
tadpoles by replacing egg-cell nuclei with 
nuclei from blastomeres — cells that are 
made through the splitting of a fertilized egg 
cell during early development. However,
the transfer of nuclei from R. pipiens cells at 
more-advanced stages of differentiation — 
from when the hollow ball of blastomeres 
differentiates into a multilayered structure 
called a gastrula, onwards — did not support 
the development of normal frogs (Fig. 1b).




Thus, Briggs and King’s results demonstrated 
that the nuclei in blastomeres are not irreversibly
changed with differentiation. However,
they also indicated that, as development pro- 
gresses, the potential of transplanted nuclei 
to support normal development decreases — 
suggesting that cell differentiation might be 
irreversible and might involve irreversible 
genetic changes. Thus, Briggs and King concluded
that the nuclei of cells in the late-stage
gastrula have an “intrinsic restriction
in potentiality for differentiation”.

In 1958, Gurdon, Elsdale and Fischberg 
addressed the questions surrounding the 
potential of differentiated cells using a differ- 
ent species of frog, Xenopus laevis (the African 
clawed frog). In contrast to the Rana species, 
whose availability is seasonally restricted, 
X. laevis is available year round and rapidly 
reaches sexual maturity. In the authors’ experiments,
donor nuclei from cells at various
developmental stages, from early blastomeres 
to cells from tadpoles just before hatching, 
were transferred into Xenopus egg cells. 

The donor nuclei were derived from a 
mutant stock in which each cell contained only 
one nucleolus (an organelle inside the nucleus) 
instead of the usual two. This approach pro- 
vided a useful visual marker to confirm that the 
resulting animals obtained from nuclear trans- 
fer were indeed derived from the transferred 
nucleus, and not from existing material in the
egg. These experiments demonstrated that 
normal tadpoles could be obtained from cells 
at stages of development up to pre-hatching 
tadpole stages (Fig. 1c) — much later than the 
developmental stage of the cells that Briggs 
and King had used. 

Many of the tadpoles that developed from 
cells containing transferred nuclei underwent 
normal metamorphosis into frogs, which 
seemed to be sexually mature. The authors 
noted that the lone frog derived from the 
most-differentiated cell nucleus was “acciden- 
tally killed shortly before metamorphosis”. A 
subsequent report was free of such misadventure;
it described the derivation of fertile
adult frogs from the transplanted nuclei of 
fully differentiated cells collected from the 
intestines of feeding tadpoles. 

Gurdon and colleagues thus demonstrated, 
unlike Briggs and King, that differentiated 
nuclei could support successful development. 
Despite this discordance, both groups agreed 
that the advance of a nucleus through differen- 
tiation was accompanied by a reduction in its 
ability to support normal development. On the 
basis of their findings that some differentiated 
nuclei could support normal development 
(albeit with a relatively limited frequency of 
success), Gurdon and colleagues concluded 
that the differentiated cell state is not a result
of irreversible genomic changes. Rather, the 
nuclei of differentiated cells retain the capa- 
city to orchestrate the development of a fully 
functioning organism. 

Almost 40 years after these amphibian 
experiments, transfer of the nucleus of an 
adult mammary epithelial cell was used to generate
a cloned mammal: Dolly the sheep. The
first mouse to be cloned using nuclear transfer
from adult cells, Cumulina, was reported
shortly afterwards. To prove beyond doubt
that cloned animals could be produced using
nuclei from fully differentiated cells (and had
not previously been derived from contaminant
stem cells that had broader potential), mice
were derived using the nuclei of mature B cells
and T cells. During maturation, the genomes
of both of these types of immune cell undergo
DNA rearrangements, which were detected in
the clones.

Figure 1| Key milestones in understanding the potential of differentiated cells. a, In 1892, Weismann
proposed that, as cells in a developing embryo differentiate, they retain only those genes required to maintain
cell-type identity, rendering differentiation an irreversible process. b, Studying the Northern leopard frog
(Rana pipiens), Briggs and King reported in 1955 that nuclei from differentiated cells that were transferred
into an egg cell from which the nucleus had been removed (an enucleated egg cell) could not support normal
development, in line with Weismann’s thinking. c, In their 1958 Nature paper, Gurdon, Elsdale and Fischberg
challenged the notion that development is irreversible, reporting that nuclei derived from differentiated cells
of the African clawed frog (Xenopus laevis) could, in fact, support normal development. d, In 2006, Takahashi
and Yamanaka identified a core set of four transcription factors that reset differentiated mouse cells to a
pluripotent state, capable of giving rise to any cell types in the body.

Together, this rich history of nuclear trans- 
fer revealed that cell differentiation can be 
reversed, resetting cell identity to the earli- 
est embryonic stages. This pioneering work 
formed the foundations for the reprogramming 
field, which has the core goal of manipulating 
cell identity to produce any desired cell type. 

In the 1980s, early work in reprogramming
revealed that it is possible not only to reset 
cell identity to the blank slate of early embry- 
onic development, but also to switch a cell’s 
identity altogether. For example, one study” 
showed that fusion of a mouse muscle cell with
a human amniocyte (a fetal cell that floats in
the amniotic fluid) to produce a cell with both
a human and a mouse nucleus resulted in the
rapid expression of human muscle-specific
genes. This showed that factors produced in a
differentiated cell (in this case, the mouse muscle
cell) can induce the expression of genes
that are repressed in another type of differentiated
cell (in this case, the human amniocyte).
Together with the nuclear-transfer studies, 
these pivotal experiments established that fac- 
tors produced in egg cells and differentiated 


cells are able to direct cell fate by regulating 
gene expression. 

A key moment came in 1987, when a single 
factor capable of reprogramming cell iden- 
tity was identified; the expression of a pro- 
tein called MyoD (a transcription factor) was 
shown to convert fibroblast cells into con- 
tracting muscle cells". Gurdon was some- 
what pessimistic about the prospect that cell 
reprogramming could be quickly achieved 
using a defined set of factors, stating in 2006, 
“Looking far ahead, it may become possible to 
convert cells of an adult to an embryonic state
without needing to use eggs”””. However, just 
a few months later, Kazutoshi Takahashi and 
Shinya Yamanaka reported that differentiated 
cells could be reset to a pluripotent state (that
is, a state in which they could differentiate into
multiple types of cell) through the expression
of only four transcription factors” (Fig. 1d). In
2012, Gurdon and Yamanaka were awarded 
the Nobel Prize in Physiology or Medicine for 
their work. 

Since Gurdon and colleagues’ paper 
demonstrating that developmental poten- 
tial can be reinstated in differentiated cells, 
cell biologists have developed the ability to 
reprogram cell identity by several routes. For
example, we can use transcription-factor-me- 
diated reprogramming to return cells to an 
embryonic state” and subsequently direct 
their differentiation to desired identities by
mimicking normal developmental processes. 
Alternatively, embryonic states can be alto- 
gether avoided by expressing specific factors 
to directly convert a differentiated cell type to
another cell identity ”®. Such strategies offer 
the potential to produce patient-derived cells 
for modelling diseases in vitro”’. 

Moreover, cell reprogramming forms 
the basis of various proposed regenerative 
therapies, including the generation of cells 
that line the retina at the back of the eye to 
treat a disorder called age-related macular 
degeneration’’, a major cause of vision loss. 

Gurdon and colleagues’ 1950s conclusions 
that the developmental clock can be reset chal- 
lenged the long-standing theory at that time 
that cell differentiation is an irreversible pro- 
cess. Their work now represents a cornerstone 
of current reprogramming technologies that 
aim to deliver a range of cell types for disease 
modelling and regenerative therapies. 
10 extraordinary papers


Atmospheric science 


The discovery of the 
Antarctic ozone hole 


Susan Solomon 


The unexpected discovery of a hole in the atmospheric ozone
layer over the Antarctic revolutionized science, and helped
to establish one of the most successful global environmental
policies of the twentieth century.


In 1985, Joe Farman, Brian Gardiner and 
Jonathan Shanklin reported’ unanticipated 
and large decreases in stratospheric ozone 
levels over the Antarctic stations of Halley 
and Faraday. Their data showed that, after 
about 20 years of fairly steady values, ozone 
levels began dropping in the austral spring 
months around the late 1970s (Fig. 1). By 1984, 
the stratospheric ozone layer over Halley in 
October was only about two-thirds as thick as 
that seen in earlier decades — a phenomenon 
that became known as the Antarctic ozone 
hole. Farman et al. boldly suggested a link 
to human use of compounds called chloro- 
fluorocarbons (CFCs), often used in aerosol 
cans and cooling devices such as fridges. Their
findings transformed the fields of atmos- 
pheric science and chemical kinetics, and led 
to global changes in environmental policy. 
The stability of the stratospheric ozone layer 
has attracted the interest of scientists, the pub- 
lic and policymakers for more than 50 years 
because this layer protects life on Earth’s sur- 
face from biologically damaging ultraviolet 
radiation. The potential for pollutants known
as nitrogen oxides to deplete global ozone 
prompted much research’ on the influence 
of aviation on the ozone layer®. A study* in 
1974 suggested that chlorine monoxide 
(ClO) produced from CFCs might similarly
deplete ozone. By the early 1980s, the best 
projections from stratospheric models indi- 
cated that continuing production of CFCs at 
then-current amounts risked the destruction 
of only about 2-4% of the ozone layer by the 
end of the twenty-first century’. There was no 
suggestion that ozone at polar latitudes would 
be especially sensitive. 

The expected depletion was relatively 
small and far in the future, but posed serious 
threats, including increased incidence of 
skin cancers and ecological damage. Inter- 
national policymakers therefore concluded 
that a cautious ozone-protection strategy was 
needed, and, in March 1985, the United Nations
Vienna Convention for the Protection of the 
Ozone Layer was signed. It called for more 
ozone research, but contained no legally 
binding goals for CFC reductions>.

Figure 1 | Ozone over Antarctica. a, In 1985, Farman et al.' reported that stratospheric ozone levels over
the Halley and Faraday stations in Antarctica during the austral spring had declined greatly from previously
steady values. The graph shows the Halley time series, extended to 2016. b, Subsequent satellite monitoring
revealed that the area of ozone depletion (the ozone hole) extended over a vast region. This satellite map
shows ozone levels on 10 September 2000, when ozone depletion was close to its maximum: blue indicates
low ozone levels; red, high levels. The position of the Halley station is indicated.

Farman and colleagues’ report of a loss 
of one-third of the springtime ozone layer 
over Antarctica was published a few months 
later. The paper’s strengths were the authors’ 
careful analysis of the seasonal character of 
the change, and the fact that changes were 
detected using two different instruments. 
The authors suggested that Antarctica’s 
extremely cold temperatures during winter 
and spring made the region “uniquely sensi- 
tive to growth of inorganic chlorine” produced 
in the atmosphere from CFCs, although the 
chemical mechanism they proposed was incor- 
rect. The careers of hundreds of scientists and 
dozens of diplomats worldwide were abruptly 
transformed by this single paper. 

At that time, the atmospheric chemistry of 
the Antarctic was terra incognita. Measure- 
ments needed to be made both at ground level 
and from aircraft to understand whether CFCs 
had a role in producing the ozone hole. Scientists
were energized and excited to attack
the challenge. 

I was fortunate to be among a group of scientists
who went to the US station at McMurdo in
1986, where the first Antarctic measurements
of ClO (ref. 6) and of another CFC-derived
ozone-depleting compound, chlorine dioxide
(OClO) (ref. 7), were obtained. These compounds
were roughly 100-fold more abundant
than elsewhere. The ‘smoking gun’ for the role
of CFCs in ozone depletion came from aircraft
measurements taken in 1987. They revealed’ a 
dramatic enhancement in ClO levels (compa- 
rable to those at McMurdo) and a co-located 
decrease in ozone concentrations as the plane 
flew south from Chile into the Antarctic. 

These independently obtained data sets 
indicated that the Antarctic was indeed 
uniquely sensitive to chlorine compounds’, 
as Farman et al. had suggested. Unusual 
changes in atmospheric abundances of related 
chemicals were also measured’®. Moreover, 
satellite monitoring confirmed that depletion 
extended over a vast region (typically up to 
about 20 million square kilometres; see ref. 10, 
for example). 

The response of policymakers to Farman 
and colleagues’ paper was initially cool. In 
my view, this was because they did not want 
to upset the apple cart of the delicate diplo- 
macy embarked on with the Vienna conven- 
tion until it was clear that the science was 
correct. Nevertheless, they argued that pre- 
cautionary principles were part of the conven- 
tion, and — even as the research planes were 
flying from Chile — signed the 1987 Montreal 
Protocol on Substances that Deplete the 
Ozone Layer. This was an agreement to freeze 
production and consumption of ozone-de- 
pleting substances at then-current rates, and 
to meet over time to consider whether to 
decrease production. 

But the signing of global environmental 
agreements is only a ceremonial first step;
they must subsequently be ratified and 
strengthened over time’. I believe that Farman 
and colleagues’ paper led to the remarkably 
fast ratification of the protocol in 1989, and 
to later amendments (beginning with the 
London Amendment in 1990) that included 
ever-tightening restrictions on the global pro- 
duction and consumption of ozone-depleting 
substances. 

So why was the ozone hole not seen in computational
simulations of the stratosphere?
It turned out that the models lacked a key 
ingredient: by considering only gas-phase 
atmospheric chemistry, they overlooked 
the activation of ozone-destroying chlo- 
rine species that occurs on and within polar 
stratospheric cloud particles at extremely 
low temperatures”. The discovery of the 
missing ingredient drew physical chemists 
in increasing numbers to study the surface 
chemistry involved”. Previously unknown gas- 
phase reactions associated with ozone deple- 
tion were also identified, particularly those 
involving a ClO dimer (see ref. 10, for example). 
Laboratory and field studies were carried out, 
and microphysical models were developed 
(see ref. 14, for example), to determine what 
polar stratospheric clouds are made of: ice, 
nitric acid hydrates or supercooled liquids. 
The answer was that they could be all three, 
depending on temperature and the histories 
of the sampled air parcels. 

Ground-based and airborne missions to 
understand Arctic ozone chemistry” were also 
inspired by Farman and colleagues’ paper and 
related studies. It emerged that ozone loss in 
the Arctic is generally much less severe than in
the Antarctic, broadly because temperatures
in the region are warmer as a result of meteorological
differences between the two regions.
The coupling of chlorine-containing species 
with bromine-containing ones was found to 
be a key ingredient in polar ozone depletion, 
especially in the Arctic’. 

Atmospheric modelling also progressed 
to simulate the newly discovered processes, 
evolving from two dimensions (latitude- 
altitude) to three (latitude-altitude-longi- 
tude), to better represent global stratospheric 
temperatures, winds and circulation”. Dynam- 
ical studies have shown that the ozone hole 
influences Antarctic winds and temperatures 
not just in the stratosphere, but also in the 
underlying troposphere, and there is evidence 
for climate connections at other latitudes”. 
Modern global climate models therefore 
include increasingly detailed representations 
of stratospheric chemistry and dynamics. The 
ozone hole has thus inspired a new generation 
of scientists to probe climate-chemistry 
interactions, forging connections between 
previously separate disciplines. 

The Montreal Protocol led to global CFC 
production and consumption phase-outs
by 2010, and now the Antarctic ozone hole
is slowly healing’®. The protocol thus prevented
the ozone layer from collapsing” and
is a signature success story for global environmental
policy. Because CFCs have atmospheric
lifetimes of 50 years or more, the atmosphere 
will not fully recover until after 2050, even 
in the absence of further emissions. 
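
The slow recovery can be illustrated with a rough first-order decay estimate. This is a simplified sketch, not a result from the studies cited: it assumes that the quoted lifetime is an e-folding time, that emissions stop entirely at the 2010 phase-out, and that the atmosphere behaves as a single well-mixed reservoir. The symbols N, N_0, t and \tau are introduced here purely for illustration.

```latex
% Assumed model: exponential decay of the atmospheric CFC burden N(t)
% with e-folding lifetime \tau and no emissions after t = 0 (taken as 2010).
\begin{align*}
N(t) &= N_0 \, e^{-t/\tau} \\
% Fraction remaining 40 years later (2050) for \tau = 50 years:
\frac{N(40\ \text{yr})}{N_0} &= e^{-40/50} = e^{-0.8} \approx 0.45
\end{align*}
```

Under these assumptions, roughly 45% of the CFC burden present in 2010 would still be in the atmosphere in 2050, which is consistent with a recovery that extends beyond mid-century.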

However, recent work” provides strong 
evidence of the continuing production and 
release of one type of CFC (trichlorofluoro- 
methane). The source is not large enough to 
reverse the healing of the ozone hole, but it 
is slowing recovery and shows that there is 
still a need for scrutiny in this field. Research 
into, and policy to protect, the stratosphere 
will thus continue to be inspired by Farman 
and colleagues’ research — and will probably 
do so until the ozone hole finally closes. 
The advent and rise of
monoclonal antibodies 


Klaus Rajewsky 


A 1975 Nature paper reported how cell lines could be made
that produce an antibody of known specificity. This discovery
led to major biological insights and clinical successes in
treating autoimmunity and cancer.


In their 1975 Nature paper’, the immunologists 
Georges Köhler and César Milstein described
the production of monoclonal antibodies of 
predetermined specificity, each made by a 
continuously growing cell line that had been 
generated by the fusion of an antibody-pro- 
ducing cell from an immunized mouse with 
an immortal cancer cell specialized for anti- 
body secretion. Hearing from César about 
this work before it was published, on the way 
to an obscure meeting in San Remo in Italy, I 
knew immediately that our research field had 
reached a turning point. 

Antibodies were discovered in 1890 by the 
physiologist Emil von Behring and the micro- 
biologist Shibasaburo Kitasato as protective 
antitoxins in the blood of animals exposed to 
diphtheria or tetanus toxin’. Ever since, anti- 
bodies have been a major research subject, 
given their key role in adaptive immunity 
(specific immune responses against, for exam- 
ple, invading disease-causing agents) and 
their wide range of specificities, essentially
covering the universe of chemical structures. 
This had stood out from early on as a major 
genetic puzzle. How can our limited genome 
encode a seemingly limitless repertoire of 
specificities? And in medical (and industrial) 
practice, antibodies have been used ever since 
their discovery as the basis for serum therapy 
(the treatment of infectious diseases using 
blood serum from immunized animals), as 
diagnostic tools to monitor infectious disease, 
and in innumerable other contexts. 

But antibodies specific for any given mol- 
ecule (called an antigen in the context of an 
antibody response) came, with a few notable 
exceptions, as mixtures of antibodies, pro- 
duced by thousands of antibody-producing 
cells in an immunized animal or infected 
person. Each of these cells produced an 
antibody of its own kind, so that ‘antibody 
specificity’ usually referred to the properties 
of antibody populations rather than those 
of individual antibodies. The inability to
produce molecularly defined, homogeneous
antibodies of predetermined specificity was
a major hurdle that needed to be overcome.

Figure 1 | The production of monoclonal antibodies. Köhler and Milstein’s 1975 Nature paper' solved
the problem of how to generate clones of continuously dividing cells that make antibodies of a known
specificity. The ability to generate such monoclonal antibodies revolutionized antibody research and
paved the way to clinical advances. The authors injected mice with sheep red blood cells and isolated spleen
cells, including those that produce antibodies. Different antibody colours indicate antibodies specific
for different molecules (antigens), and produced by different cells. The authors had the idea of fusing
antibody-producing spleen cells of limited lifespan with myeloma cells: immortal cancerous immune cells
secreting antibodies of unknown specificity. Spleen cells that had been activated upon antigen recognition
fused preferentially with the myeloma cells, generating hybrid cells called hybridomas. Unlike unfused cells,
the hybridoma cells could grow on the selective agar plates used, and formed colonies of identical cells.
Hybridomas that secreted antibodies specific for sheep red blood cells were identified by their ability to
destroy such cells when added to the agar, generating a clearance (plaque). These original hybridoma cells
made two types of antibody, one that recognized sheep red blood cells and another of unknown specificity.

This changed overnight with Köhler and
Milstein’s paper. Köhler had joined Milstein’s
group at the MRC Laboratory of Molecular 
Biology in Cambridge, UK, as a postdoc, to 
study the mechanism of somatic mutation that 
operates in antibody diversification. The plan 
was to use mouse myeloma cells for this pur- 
pose. These are tumour cells originating from 
antibody-secreting immune cells. The cancer 
immunologist Michael Potter at the National 
Cancer Institute in Bethesda, Maryland, had 
shown years before that myelomas could be 
induced in a particular mouse strain by the 
injection of mineral oil’. The Milstein team 
was propagating and fusing to each other 
cells obtained from cell lines derived from 
various such tumours. However, the mye- 
loma antibodies were ill-defined in terms 
of specificity. Could one perhaps fuse anti- 
body-producing cells from immunized mice 
to myeloma cells, to produce continuously 
dividing cells that make antibodies specific 
for the immunizing antigen? To detect such 
fused cells, an approach offered itself which 
Köhler had become acquainted with during his
PhD at the Basel Institute for Immunology in 
Switzerland and that had been developed by 
the institute’s director, Niels Jerne*. This was a 
simple technique in which cells secreting anti- 
bodies in response to, and specific for, sheep 
red blood cells (SRBCs) can be identified by 
the formation of a clearance (called a plaque)
in SRBC-containing agar plates. 

With this, the stage was set for the Köhler–Milstein
experiment (Fig. 1). Large numbers
of plaque-forming hybrid cells secreting anti- 
SRBC antibodies appeared when spleen cells 
from SRBC-immunized mice were fused with 
myeloma cells. The fused cells had acquired 
expression of a single type of anti-SRBC anti- 
body from a spleen cell and preserved the 
immortality and high rate of antibody secre- 
tion of the myeloma fusion partner. Myeloma 
and spleen cells were unable to multiply 
under the chosen experimental conditions, 
and the myeloma cells apparently preferred 
antigen-activated spleen cells over others for 
fusion, a prerequisite for the striking success 
of the experiment. 

The fused cells could be cloned and propa- 
gated indefinitely as what were later termed 
hybridomas, producing unlimited amounts of 
monoclonal antibodies. The first-generation 
hybridomas secreted two types of antibody: 
the desired one, plus an antibody of unknown 
specificity originating from the myeloma 
fusion partner. But this two-antibody problem 
was soon solved through the isolation of myeloma
lines that had lost antibody expression”.

Antibodies against any desired antigen 
could now be generated, investigated and 
used as homogeneous molecular entities. 
In 1984, Köhler and Milstein won the Lasker
Award together with Potter, and that same year
Köhler, Milstein and Jerne were awarded the
Nobel Prize in Physiology or Medicine. 
The impact of the Köhler–Milstein paper
on biomedical and, specifically, immuno- 
logical research was dramatic, propelled by 
scientific developments that occurred around 
the time the paper appeared. Thus, it became 
clear shortly afterwards that the variable and 
constant regions of antibodies are encoded 
by separate gene segments. Antibody diver- 
sity arises when somatic recombination joins 
gene segments together, and when a subse- 
quent process called somatic hypermutation 
operates, during the course of the antibody 
response, on the recombined gene segments
encoding antibody variable regions. Together, 
these mechanisms generate a vast repertoire 
of antibody specificities, as well as distinct 
classes of antibody, which mediate their 
various roles (effector functions) through 
their differing constant regions. 

These insights were accompanied by the 
explosive development of new molecular and 
genetic tools that allowed the isolation and 
manipulation of antibody genes in multiple 
ways. Together with the hybridoma technology, 
they fuelled a rapidly growing and still expanding
field of investigation, in which basic research
on antibody diversification and effector func- 
tion goes hand-in-hand with the production 
and engineering of monoclonal antibodies 
for diagnostic and therapeutic purposes. 

In the early days, the production of mono- 
clonal antibodies was entirely based on 
hybridoma technology and used for two main 
purposes: to study the somatic evolution of 
the antibody repertoire and the molecular 
basis of antibody specificity; and to generate 
reagents that bind to specific proteins or other 
molecules expressed by cells of the body or 
by pathogens. In both cases, completely new 
insights and technical advances resulted. Thus, 
affinity maturation of antibodies (the increase
of antibody affinity during the course of an 
antibody response) began to be understood 
at the molecular level. And the technique of 
fluorescence-activated cell sorting was revo- 
lutionized by monoclonal antibodies, allowing 
the separation of different cell types at an 
unprecedented level of specificity and reso- 
lution. Recent highlights in this area include 
approaches allowing gene-expression profiling 
of single cells that have been characterized by 
the expression of large arrays of surface-marker 
proteins through cocktails of DNA-tagged, ‘bar- 
coded’ monoclonal antibodies’. 

In medicine, monoclonal antibodies have 
an ever-increasing role and have generated a 
multibillion-dollar market, which is expected
to grow substantially in the future. In addition 
to their impact on medical diagnosis, the ther- 
apeutic application of antibodies has led to 
spectacular successes in the treatment of 
autoimmune diseases and cancer. The 2018 
Nobel Prize in Physiology or Medicine was 
awarded for the “discovery of cancer therapy 
by [antibody-mediated] inhibition of negative
immune regulation”. As often happens in
biology, both the mechanisms and the effi- 
cient induction of the inhibitory processes 
underlying this type of immunotherapy are 
still unclear, with ongoing research provid- 
ing challenges and new perspectives that 
are driving the development of monoclonal 
antibodies against additional targets. 
Monoclonal antibodies are also being 
developed to control infectious diseases — fol- 
lowing the concept of protective antibodies 
that goes back to von Behring and Kitasato. 
Prevalent diseases such as malaria, influenza 
and AIDS call for the development of what 
are termed broadly neutralizing monoclonal 
antibodies, which, applied individually or in 
cocktails, might provide broad protection*®. 
Intensive work in this direction has yielded 
promising results, including engineering anti- 
body specificity through the substitution of 
variable domains by ligand-binding domains 
from non-antibody receptors’. Yet the immune
system itself uses similar tricks’° and, by and 
large, antibody design is still unable to outdo 
it in terms of generating and selecting anti- 
body specificities". Nevertheless, the mani- 
fold modern molecular, cellular and genetic 
approaches to selecting and engineering 
antibodies have had, and continue to have, 
a tremendous impact on the field, whether 
by producing partly or fully human antibod- 
ies of different classes, making bi-specific 
or toxin-conjugated antibodies for specific 
therapeutic purposes, or incorporating
antibody variable regions into chimaeric antigen
receptors on T cells for use in an anticancer
treatment called CAR-T cell therapy. 
Monoclonal antibodies are nowadays 
often generated by isolating or transform- 
ing antibody-producing cells taken directly 
from immunized animals or patients, and 
transplanting the antibody-encoding genes 
of these cells into suitable producer cell lines, 
rather than using hybridoma technology” “*. 
But they started their spectacular career in 
1975, secreted by hybridoma cells in Köhler
and Milstein’s SRBC-containing agar plates. 


Klaus Rajewsky is at the Max-Delbrück-Center
for Molecular Medicine in the Helmholtz 
Association, 13125 Berlin, Germany. 
The nano-revolution
spawned by carbon 


Pulickel M. Ajayan 


In 1985, scientists reported the discovery of the cage-like
carbon molecule C₆₀. The finding paved the way for materials
such as graphene and carbon nanotubes, and was a landmark in
the emergence of nanotechnology.


The history of the carbon molecule C₆₀
highlights the fact that discoveries do not
happen in a predefined sequence. C₆₀, carbon
nanotubes and graphene (single layers
of graphite) are essentially members of the
same family: all are nanoscale structures that
consist of carbon atoms arranged in a periodic
crystal lattice. Graphite has been known for
a few hundred years, and individual layers of
the material could be separated easily. However,
the identification of C₆₀ by Kroto et al.
did not occur until 1985. This, in turn, led to
the discovery of graphene nearly two decades
later”. Both of these breakthroughs led to
Nobel prizes, in chemistry for C₆₀ (1996) and
in physics for graphene (2010).

The discovery of C₆₀ occurred on the campus
of Rice University in Houston, Texas. Eiji
Osawa, a Japanese theoretical chemist, had 
predicted? the stable structure of a 60-atom 
carbon molecule in 1970, but this finding did 
not come to the attention of the mainstream 
scientific community. Experimental results 
from mass spectrometry were also beginning 
to emerge, showing the stability of 60-atom 
carbon clusters. However, no one made the 
connection that these clusters would have 
the structure that Osawa had predicted. It 
150 years ago


Aphorisms by Goethe — the opening 
article of the first issue of Nature, 
4 November 1869. 


Nature! We are surrounded and embraced 
by her: powerless to separate ourselves 
from her, and powerless to penetrate 
beyond her. Without asking, or warning, 
she snatches us up into her circling dance, 
and whirls us on until we are tired, and drop 
from her arms. She is ever shaping new 
forms: what is, has never yet been; what 
has been, comes not again. Everything is 
new, and yet nought but the old ... 

So far Goethe. 

When my friend, the Editor of NATURE, 
asked me to write an opening article for his 
first number, there came into my mind this 
wonderful rhapsody on “Nature”, which has 
been a delight to me from my youth up. It 
seemed to me that no more fitting preface 
could be put before a Journal, which aims 
to mirror the progress of that fashioning 
by Nature of a picture of herself, in the 
mind of man, which we call the progress of 
Science. 

[In a letter to Chancellor von Muller] 
Goethe says, that about the date of this 
composition of “Nature” he was chiefly 
occupied with comparative anatomy; 
and in 1786, gave himself incredible 
trouble to get other people to take an 
interest in his discovery, that man has an
intermaxillary bone. After that he went
on to the metamorphosis of plants; and 
to the theory of the skull; and, at length, 
had the pleasure of his work being taken 
up by German naturalists. The letter 
ends thus :—“If we consider the high 
achievements by which all the phenomena 
of Nature have been gradually linked 
together in the human mind ... we shall, not 
without a smile ... rejoice in the progress of 
fifty years.”... 

When another half-century has passed, 
curious readers of the back numbers of 
NATURE will probably look on our best,
“not without a smile;” and, it may be, that
long after the theories of the philosophers 
whose achievements are recorded in 
these pages, are obsolete, the vision of the 
poet will remain as a truthful and efficient 
symbol of the wonder and the mystery of 
Nature. 

T. H. Huxley 


Figure 1 | Three major nanoscale carbon structures discovered in the past 35 years. a, In 1985, Kroto et al.’
reported the discovery of the molecule C₆₀. It has a cage-like structure that consists of 12 pentagonal and 20
hexagonal faces. b, Following Kroto and colleagues’ work, carbon nanotubes were first produced” in 1991. A
carbon nanotube can be thought of as a 2D hexagonal lattice of carbon atoms that is rolled up to form a hollow
cylinder. c, In 2004, scientists reported the isolation of graphene? (a single layer of carbon atoms in a 2D
hexagonal lattice).


was against this backdrop that the visit of the 
British chemist Harry Kroto to the laboratories
of Rice scientists Richard Smalley and Robert 
Curl proved significant. 

Kroto was an expert in molecular spectros- 
copy and had an interest in the molecules 
that exist in interstellar space. He proposed 
a simple mechanism for the formation of the
small carbon-chain molecules that had been 
observed in interstellar gas clouds, and sug- 
gested that this idea could be tested using 
Smalley’s experimental apparatus. Smalley, 
Curl and their students were making many different
atomic clusters, such as those of silicon,
through ablation — the removal of material 
from the surface of a target — and were analys- 
ing the masses of these clusters in detail. After 
some delay, Kroto’s proposal was accepted and 
he journeyed to Houston. 

In previous work by other groups⁴, a peak
corresponding to C₆₀ was somewhat prominent
in mass spectra. During the experiments
at Rice to test the mechanism of carbon-chain 
formation, it became clear that the C₆₀ peak
could be made extremely strong under certain
conditions. However, the structure of the C₆₀
molecule was the main puzzle that needed to 
be solved. The team accomplished this task, 
and published the first report in 1985. 

The structure of C₆₀ turned out to be a
beauty (Fig. 1a). It looked exactly like the clas- 
sic design of a football (soccer ball). More pre- 
cisely, the structure is about 0.7 nanometres 
across and is a truncated icosahedron — a pol- 
yhedron that has 12 pentagonal and 20 hexag- 
onal faces. This highly symmetric, cage-like 
shape was first described by Archimedes, and 
the rules that guide the topology of polyhedra 
were first developed by Descartes. 

When applied to polyhedra that are made 
of only pentagons and hexagons, these rules 
imply that every such closed structure can con- 
tain any number of hexagons but must have 
exactly 12 pentagons. Heptagons can also be 
introduced, producing negative curvature
(saddle-shaped surfaces), but the topological
effect of a heptagon is cancelled by that of
a pentagon. In the mid-eighteenth century,
the Swiss mathematician Leonhard Euler
had proposed a formula⁵ for these geometric
rules, which were now profoundly manifested
at the nanoscale in C₆₀. Larger closed carbon
cages (such as C₇₀ and C₈₄) also exist, and can
be formed by simply adding more hexagons 
to the cage. 
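
The counting argument behind the 12-pentagon rule can be checked numerically. The sketch below is illustrative only: it assumes that every carbon atom sits at a vertex where exactly three faces meet (as in C₆₀), and the function names `cage_counts` and `euler_characteristic` are hypothetical helpers, not taken from the article. Under that assumption, Euler's relation V - E + F = 2 for a closed, sphere-like polyhedron reduces to (pentagons - heptagons) = 12, whatever the number of hexagons.

```python
from fractions import Fraction


def cage_counts(pentagons, hexagons, heptagons=0):
    """Return (vertices, edges, faces) for a cage built from the given rings.

    Assumption: every edge is shared by two faces and every vertex by three
    faces (each atom has three neighbours). Fractions keep the arithmetic
    exact when the ring counts do not divide evenly.
    """
    faces = pentagons + hexagons + heptagons
    sides = 5 * pentagons + 6 * hexagons + 7 * heptagons
    edges = Fraction(sides, 2)
    vertices = Fraction(sides, 3)
    return vertices, edges, faces


def euler_characteristic(pentagons, hexagons, heptagons=0):
    """V - E + F; equals 2 for a closed, sphere-like polyhedron."""
    vertices, edges, faces = cage_counts(pentagons, hexagons, heptagons)
    return vertices - edges + faces


if __name__ == "__main__":
    # 12 pentagons and 20 hexagons: the truncated icosahedron of the text.
    print(cage_counts(12, 20))
    print(euler_characteristic(12, 20))
    # Adding hexagons (here 25) keeps the cage closed.
    print(euler_characteristic(12, 25))
    # With only 11 pentagons the relation fails, so the cage cannot close.
    print(euler_characteristic(11, 20))
    # One heptagon is offset by one extra pentagon.
    print(euler_characteristic(13, 20, 1))
```

For example, `cage_counts(12, 20)` gives the 60 atoms, 90 bonds and 32 faces of C₆₀, and swapping one pentagon for a hexagon leaves a characteristic of 11/6 rather than 2.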

The family of C₆₀ and larger molecules have
come to be known as the fullerenes, after the
US architect Buckminster Fuller. Fuller had 
become famous for designing stable domes
and buildings⁶ that have shapes similar to
that of C₆₀. The correspondence was striking,
although the scale differed by a factor of
about 10 billion. So it was that the C₆₀ family
got its name (its members could well have 
been called soccerenes). 

Kroto and colleagues’ fullerene discovery 
took other scientists by surprise. Initially, 
there were quite a few sceptics; many thought 
that C₆₀ was flat rather than cage-like. However,
this perception changed after work by
the German chemist Wolfgang Krätschmer, the
US chemist Donald Huffman and their students.
In 1990, these researchers succeeded⁷
in isolating C₆₀ molecules from carbon soot in
bulk, thereby making the substance available 
for large-scale experiments. 

The fullerene discovery immediately had 
two major consequences. First, fullerenes were 
used to synthesize a large variety of unconventional
materials. For example, endohedrals⁸
(fullerenes that enclose metal atoms),
fullerene-assembled solids and superconducting
fullerene materials⁹ were produced and
characterized with excitement. Fullerenes
were seen as a distinctive, stable molecular system
and as an ideal building block for making
unprecedented materials. They were also 
touted as a new allotrope (structural form) 
of carbon that deviated from the familiar 
graphite and diamond. 

Second, the discovery provided the 
impetus to seek other carbon allotropes — 
particularly nanoscale materials. The most 
substantial result from this search was the 
synthesis and development of carbon nano- 
tubes (Fig. 1b) by the Japanese physicist 
Sumio Iijima¹⁰ and colleagues¹¹ in the early
1990s. Carbon nanotubes showed that the
electronic structures of carbon layers could
be tuned by structural nanoscale engineering,
suggesting possible uses in electronics and 
other applications. 

Over the following two decades, a rush of 
research activities, publications and patents 
would make fullerenes and carbon nanotubes 
the poster children of nanotechnology. It 
was also during this period, in 2004, that the 
Russian physicists Andrei Geim and Konstantin
Novoselov isolated graphene² (Fig. 1c).
Graphene was the first example of a truly 
stable 2D material and revealed the physics 
associated with such 2D systems. 

It has been nearly 35 years since Kroto and 
colleagues’ fullerene paper was published. 
In spite of all the potential that fullerenes 
promised, these molecules have not led to any 
major applications, barring a few encouraging 
ideas in solar cells and biochemistry. However, 
the work paved the way for many innovations 
in nanomaterials that will ultimately find uses
in nanotechnology. The fullerene discovery 
and what followed show the ingenuity of the 
human mind in solving a nanoscale puzzle. 
Fullerenes also provide a curious case in which
an architect’s name was dragged into a major 
scientific discovery. Buckminster Fuller would 
probably not have minded. 


Pulickel M. Ajayan is in the Department of 
Materials Science and NanoEngineering, 
Rice University, Houston, Texas 77006, USA. 
e-mail: ajayan@rice.edu 


1. Kroto, H. W., Heath, J. R., O’Brien, S. C., Curl, R. F. & 
Smalley, R. E. Nature 318, 162-163 (1985). 

2. Novoselov, K. S. et al. Science 306, 666-669 (2004). 

3. Osawa, E. Kagaku 25, 854-863 (1970). 

4. Rohlfing, E. A., Cox, D. M. & Kaldor, A. J. Chem. Phys. 81,
3322-3330 (1984).

5. Euler, L. Novi Comm. Acad. Sci. Imp. Petropol. 4, 109-140 
(1758). 

6. Sieden, L. S. Buckminster Fuller’s Universe: His Life and 
Work (Basic, 2000). 

7. Krätschmer, W., Lamb, L. D., Fostiropoulos, K. &
Huffman, D. R. Nature 347, 354-358 (1990). 

8. Chai, Y. et al. J. Phys. Chem. 95, 7564-7568 (1991). 

9. Hebard, A. F. et al. Nature 350, 600-601 (1991). 

10. Iijima, S. Nature 354, 56-58 (1991).

11. Ebbesen, T. W. & Ajayan, P. M. Nature 358, 220-222 (1992). 


Readers respond 


Correspondence 


Darwin brokered 
others’ publications 


Charles Darwin was not just an 
occasional contributor who 
used Nature to share his own 
findings and discussions (see
Y. Liu Nature 574, 36; 2019).
He also gave voice to naturalists
around the world, at a time
when the journal was not easily 
accessible to the international 
scientific community. 

Roughly half of Darwin’s 
contributions to Nature were 
transcripts (with due credit) of 
reports and findings sent to him. 
The writers of the letters were 
from Brazil, the United States, 
Peru and Poland. His most 
frequent correspondent was 
Johann Friedrich Theodor (Fritz) 
Müller, a German scientist who
emigrated to southern Brazil
in 1852. Müller, like several of
Darwin’s interlocutors, tested 
and observed facts described in 
On the Origin of Species (1859). 

Müller’s support for the
theory of evolution was
expressed in his book Für Darwin
(1864). Darwin considered it of 
such importance that he himself 
sponsored its translation into 
English, published the year of 
Nature's launch, 1869, under 
the title Facts and Arguments 
for Darwin. This initiated a 
17-year friendship between the 
two naturalists, documented 
by intensive correspondence. 
From 1874 until the year
before his death in 1882,
Darwin transcribed seven of the
scientific reports he received
from Müller and submitted
them to Nature. 


Antonio C. Marques, Klaus 
Hartfelder University of 
São Paulo, Brazil.
marques@ib.usp.br 


Scientism and the 
abuse of science 


The philosopher Friedrich 
Hayek originally popularized 
the term ‘scientism’ in his 1979 
book The Counter-Revolution 
of Science as a synonym for
pseudoscience. The word later 
came to represent the expansion 
of science into domains
where it really has nothing
to say, such as evolution into
atheism. Nathaniel Comfort 
now positions ‘scientism’ as the 
abuse of science in ways that 
obscure today’s concerns for 
equity, inclusion and diversity 
(see N. Comfort Nature 574, 
167-170; 2019). 

Comfort condemns this 
version of scientism, in which 
practices and policies endorsed 
by scientists have had adverse 
consequences for vulnerable 
groups in society — although 
he is careful not to brand the
scientists involved as malicious 
or ignorant. The implication 
is that history should help to 
ensure that such scientism will 
not happen again. 

In my view, it is a misuse of 
history to oversee the future. 
What counts as good and bad 
in scientific practice or in 
science-based policies can be 
understood only in retrospect, 
because our judgement 
depends on witnessing the 
consequences. As we move 
forward in history, those 
judgements will change. It 
follows that the moral character 
of any action is indeterminate 
at the time it happens. 

Science itself is a quantum 
phenomenon — and ‘scientism’ 
is its observer effect. 


Steve Fuller University of 
Warwick, Coventry CV4 7AL, UK. 
s.w.fuller@warwick.ac.uk 


Engage egg donors 
in editing debate 


We argue that egg donors 
should be more involved in 
discussions on the ethical 
aspects of human germline 
gene editing (see Nature 574, 
465-466 (2019) and E. S. Lander 
et al. Nature 567, 165-168; 2019). 

Experimental data from large 
numbers of human embryos 
could be necessary to refine 
and improve germline gene 
editing, as well as to evaluate the 
technique’s safety and efficacy. 
Moreover, studies involving 
the creation of embryos seem 
preferred for testing for specific 
mutations and to reduce 
mosaicism (H. Ma et al. Nature
548, 413-419; 2017). This means 
that oocytes will have to be 
procured from large numbers 
of women. 

Oocyte harvesting exposes 
the donors to serious short- 
and long-term health risks, 
raising questions about 
the ethical acceptability of 
experiments that require this 
procedure. Although donors 
are often compensated for the 
inconvenience, the practice 
prompts concerns about undue 
inducement — particularly 
for financially vulnerable 
women. The ethical issues are 
exacerbated because it is by 
no means certain that clinical 
applications of germline gene 
editing will eventually be 
permitted. 

Above and beyond the 
physical risks, these wider 
ethical and policy issues should 
be made clear to potential 
donors so that they can make 
an informed choice and have a
chance to be properly engaged 
in the debate. 


Emilia Niemiec, Heidi Carmen 
Howard Uppsala University, 
Sweden. 
heidi.howard@crb.uu.se 


How will we fund 
open-access fees? 


The international Plan S 
research-funder consortium 
cOAlition S proposes that 
institutional libraries should 
transition from subscription to 
‘pure publish’ deals with open- 
access journals by 2024 (see 
Nature 572, 586; 2019). However, 
the coalition represents just
16 European funding agencies
and 3 international charity
foundations. Many other
European funders are not in a
position to pay open-access 
publication fees on behalf of 
their researchers. 

For example, Denmark’s 
14,000 private foundations that 
currently support half of the 
country’s research are stretched 
to the limit. Their researchers 
will therefore have no choice 
but to pay the bill out of their 
own research grants, which are 
already under intense pressure 
from spiralling costs. 

Remedial action is urgently 
needed if publication and 
knowledge flow are not to be 
skewed towards the wealthiest 
countries and universities. 

For example, national or 
European Union funds could 
be established to help cash- 
strapped researchers cover 
their publishing costs. 


Christian Sonne, Rune Dietz, 
Aage K. O. Alstrup Aarhus 
University, Denmark. 
cs@bios.au.dk 


HOW TO SUBMIT 


Correspondence may be 
submitted to correspondence@nature.com
after consulting the author guidelines and
section policies at go.nature.com/cmchno.
Competition


Prize essays 


The winners of our young- 
writer essay competition 




The contest for 18-25-year-olds received 
more than 660 entries from 68 countries. 
In May this year, as part of our 150th
anniversary, Nature asked readers aged
between 18 and 25 to enter an essay competition.
The task was to tell us, in no more
than 1,000 words, what scientific advance
they would most like to see in their lifetimes, 
and why it mattered to them. 

The response was phenomenal: we received 
661 entries. Some entrants hoped that science
would make their lifetimes much longer than 
they can currently expect. Many looked for- 
ward to work that will end climate change. 
Others wanted to see advances in neuro- 
degenerative disease, our understanding of 
human history, crop growth, space exploration,
medical technologies, water resources
or superfoods. The standard of writing was
impressive, and the scope of ideas inspiring.

ILLUSTRATIONS BY JAN KALLWEJT

The winner is a compelling essay by Yasmin 
Ali, a PhD student at the University of Nottingham,
UK. Ali submitted a thought-provoking
piece on Beethoven, her brother’s hearing loss
and the science she hopes will one day cure it.
It stood out to the judges as a reminder of why
many scientists do research: to make the world 
better tomorrow than it is today. 

All essays were judged by a group of Nature 
editors. The top ten submissions were then 
ranked by three members of a separate judg- 
ing panel: Magdalena Skipper, editor-in-chief 
of Nature; Faith Osier, an immunologist and 
researcher at the KEMRI-Wellcome Trust
Research Programme in Kilifi, Kenya; and
Jess Wade, a physicist at Imperial College 
London. All submissions were kept anony- 
mous throughout the process. 

We also selected two runners-up. Physicist 
Robert Schittko at Harvard University in Cam- 
bridge, Massachusetts, proposes that nuclear 
fusion could offer a solution to the climate 
crisis, in a piece that effortlessly mixes grand
ambition with gentle humour. And chemist
Matthew Zajac at the University of Chicago 
in Illinois wrote a powerful personal account 
of why he wants to see advances in the field of 
same-sex reproduction. 

The results show that today’s young scien- 
tists have a wealth of ideas, talent and convic- 
tion that research can transform their world. 
We look forward to seeing what they do next. 


Beethoven’s dream 


The composer wished for a cure for his hearing loss.
Soon, research could make it a reality for my twin 
brother — and millions more. By Yasmin Ali 


In 1802, under a June Sun, a 31-year-old
Beethoven paced through the countryside 
around Vienna. Rays of sunshine pierced 
through the trees, the hard soil crunched 
beneath his feet and birds conducted their 
own orchestra. But Beethoven didn’t marvel at 
these details; he was preoccupied by thoughts 
of suicide. Some years earlier, he had started 
to lose his hearing, and although it wasn’t yet 
severe, he was still struggling immensely with 
his condition. Living with hearing loss made 
his life a “wretched existence” that drove him 
into despair, he wrote. He still persevered with 
his work, and went on to create timeless music.
But he found little joy in the process. 

I observed a similar struggle at first hand,
as my twin brother Islam, when we were 18 
years old, started to lose his hearing. I noticed 
changes in his personality, too. He was always 
the outgoing troublemaker, but became quiet 
and withdrawn. Because hearing loss isn’t visible,
I didn’t know what he was going through,
which also made it difficult for me to be there 
for him. 

Today, 466 million people worldwide have 
disabling hearing loss, and over 900 million are 
expected to have it by 2050, according to the 
World Health Organization. Its impact is often 
underestimated compared with other disabil- 
ities, but people with hearing loss constantly 
experience communication difficulties in their 
everyday lives. They often mishear speech and 
find it very difficult to follow conversations. 
These miscommunications can lead to indi- 
viduals feeling isolated as they struggle to take 
part in social interactions, ultimately leading 
them to withdraw from society. As Helen Keller
once wrote: “Blindness cuts us off from things, 
but deafness cuts us off from people.” 

To this day, there is still no cure for sen- 
sorineural hearing loss (the most common 
type, and the one Beethoven had). We have 
advanced technological devices that amplify 
sound, such as hearing aids and cochlear 
implants, but these still don’t restore hearing. 
In my and my brother’s lifetimes, I’d like to see 
research make that possible. 

Sensorineural hearing loss occurs as a result
of damage to the inner ear organ, called the
cochlea, which has intricate sound-sensing
hair cells that are responsible for hearing. In 
humans and other mammals, any damage to 
hair cells is irreversible. Other animals, such as 
birds, fish, amphibians and reptiles, can spontaneously
regenerate their cochlear hair cells,
meaning that any hearing loss they develop is
only temporary. 

Scientists have been studying the regener- 
ation process of hair cells in non-mammals, 
and have identified various genes and proteins 
that have central roles. These can be targeted 
to stimulate support cells in the cochlea to in 
turn create more hair cells and replace those 
that died. 

Some of these cell therapies have been 
successful in restoring the hearing of mice and
guinea pigs: a breakthrough! These advances
have led to the development of more therapies,
and one such therapy is now being tested
for the first time in humans. The REGAIN clin- 
ical trial (REgeneration of inner ear hair cells 
with GAmma-secretase INhibitors), an inter- 
national collaboration led by researchers at 
University College London, is testing a molecule
called a γ-secretase inhibitor that could
potentially restore hearing by encouraging 
supporting cells to transform into new hair 
cells themselves. 

If it works, such a scientific advance could
transform hearing health care as we know 
it. My own research investigates the impact 
hearing loss has on people’s mental well-being. 
Many people share Beethoven’s despair 
when they realize that their hearing can’t be 
restored. Hope is an essential element for good 
mental health. 

Other members of the deaf community see 
themselves as a cultural minority, rather than 
as a disabled group to be ‘cured’. My and other 
scientists’ research aims to help those who feel 
disadvantaged by deafness and want to be able 
to hear. 

Islam and I come from interracial parents, so
we look very different. I have white, freckled 
skin, and his is olive (he gets perfect suntans, 
and I turn into a tomato). I have blue eyes, 
and his are hazelnut. I have normal hearing, 
whilst he has severe hearing loss. He and I have
shared the many chapters of our lives, and 
when things became difficult as his hearing 
declined, what helped us cope was being able 
to make sense of it all together. Communica- 
tion, self-expression, hearing and being heard 
(even through sign language) are basic human 
needs. I hope that when I voice support to my
brother in the future, he’ll be able to hear
it, receive it and not feel alone. 

When Beethoven lost his hearing, he 
secluded himself from society — but one thing 
that gave him strength was the hope that his 
hearing could be regained one day. But each 
medical remedy he attempted failed. In 1802, 
he wrote: “But, think that for six years now I have
been hopelessly afflicted, made worse by sense- 
less physicians, from year to year deceived with 
hopes of improvement, finally compelled to 
face the prospect of a lasting malady (whose 
cure will take years or perhaps be impossible).” 

Beethoven’s dream of regaining his hearing 
did not come true for him, but through the sci- 
entific advance of regeneration of hair cells, it 
could become a reality 217 years after his June
walk. On his deathbed, it is said that Beethov- 
en’s last words were “I shall hear in heaven!” 
Luckily for us, those facing hearing difficulties 
could soon be able to hear on Earth. 


Yasmin Ali is a PhD student studying mental
health and hearing loss at the University of
Nottingham, UK. 
Power play


Nuclear-fusion power plants could be part of 
a solution to the climate crisis. By Robert Schittko


If primary sources can be believed, I
conducted my first experiment with a 
high-power energy source at the tender 
age of one. 

It was New Year’s Eve 1995, and I had 
somehow gained possession of two silver 
objects I now know were screws, when my 
wandering gaze was captured by a snake-like 
item emerging from a wall. At its end, which I
was about to learn was the head of an extension
cord, there were two tiny openings, whose
black interiors stood out daringly against the 
white backdrop of a piece of plastic. Utterly 
unaware of the cautionary tale I was about to 
write, I abandoned all hesitation. I took one last
breath, homed in on my target, and shoved the
two silver objects into the two little holes, thus 
producing the first — but, fortunately, not the 
last — negative result of my newfound career. 

Twenty-four years later, my parents and 
I have fully recovered from our respective
shocks, I am still playing around with
hazardous equipment — currently as a phys- 
icist at Harvard University in Cambridge, 
Massachusetts — and the mishandling of 
energy sources on a far larger scale is now 
threatening not just my existence but that of 
tens of thousands of species worldwide. Unlike 
toddler-me, today, we cannot plead ignorance.
Even if global warming is kept to 1.5 °C above
pre-industrial levels, the Intergovernmental
Panel on Climate Change (IPCC) has warned, 
“climate-related risks to health, livelihoods, 
food security, water supply, human security, 
and economic growth” will increase. Warming
beyond that point to just 2.0 °C will further 
harm hundreds of millions of people in vul- 
nerable areas worldwide, the IPCC estimates. 
Yet the emission levels countries have volun- 
teered to aim for following the Paris agree- 
ment will warm Earth by approximately 3.0 °C 
over the next 80 years alone, and it seems that 
even these goals will not be met. 

This failure of the global political establish- 
ment to adequately address climate change 
has prompted a hunger for some sort of trans- 
formative breakthrough, either of the political 
or of the technological kind. 

Our best hope for the former — already 
expressed in a global wave of climate activism
— might be an unprecedented political move- 
ment which dramatically ups the pressure to 
act more determinedly in the face of a crisis. 

Our best hope for the latter is called 
nuclear fusion. 


Nuclear fusion is a process by which pairs 
of light atomic nuclei unite while releasing 
enormous amounts of energy. It is the mech- 
anism that powers the Sun and other stars, and 
a principle that researchers have long hoped to
harness to build nuclear-fusion power plants. 
In theory, such plants could be fuelled with 
sustainably sourced hydrogen isotopes for 
thousands of years, while being safer than 
nuclear-fission plants and producing zero 
long-lived nuclear waste. Unfortunately, 
they also come with a catch: building them is 
incredibly hard. 

This is because nuclear fusion on Earth 
requires temperatures in the order of tens 
of millions of degrees Celsius, at which the 
fusion fuel behaves as a riotous plasma. The 
difficulty in governing the behaviour of this 
plasma is the key reason why nuclear-fusion 
power plants do not exist today, despite over 
sixty years of extensive research. Neverthe- 
less, those years have resulted in many valua- 
ble insights, and a clean-energy future thanks 
to nuclear fusion seems more realistic today 
than ever before. 

The most ambitious nuclear-fusion project 
to date, ITER, is currently being constructed 
in southern France with the explicit goal of 
pushing past break-even, a so-far elusive point
of operation at which the output power of the 
fusion process exceeds the power invested 
to maintain the plasma. Helped by dozens 
of other labs around the world, ITER, which 
is scheduled to start full operation in 2035, 
will also test several auxiliary technologies 
that a working fusion plant would ultimately 
require, all while separate research into 
competing types of fusion reactors contin- 
ues elsewhere and breakthroughs such as 
deep learning advance the field (J. Kates- 
Harbeck et al. Nature 568, 526-531; 2019). 
With all this in mind, I’m hopeful that working
nuclear-fusion plants will be built well before
the end of the century, and that fusion energy 
will help substantially in limiting the impact 
of the climate crisis. 

Irrespective of that crisis, there are plenty 
of other reasons to be excited about nuclear 
fusion. As a physicist, I am humbled by the
idea of taming a plasma that is several times 
hotter than the Sun’s core. As a researcher, 
I am amazed by the complexity that a 
nuclear-fusion power plant would require in 
every aspect of its ultimate design. And as a
writer, I marvel at the prospect of mimicking
the stars, instead of merely looking up to them. 




But it is as a human, thinking of other 
humans, that I feel a breakthrough in 
controlled fusion could rise above all else. 
After all, the human cost of climate change, 
of rising seas and rising temperatures, of 
more frequent droughts and extreme weather 
events, will ultimately have to be paid. And it 
will be paid first and foremost by those who 
have the least; by the poor and the less privi- 
leged, who can be faulted for the crisis they will 
be caught up in no more than a one-year-old
boy can be faulted for electrocuting himself. 

Nuclear-fusion power plants, more so than 
any other technology, could prove a uniquely 
powerful tool to diminish that cost. 

That’s why I hope to see them in my lifetime.


Robert Schittko is an experimental 
physicist at Harvard University, Cambridge, 
Massachusetts. 


Reproduction, rethought 


Same-sex partners should one day be able to raise a
biological descendant together. By Matthew Zajac 


One afternoon as a second-year
undergraduate, I called my parents 
from my dormitory. To them it was 
a routine call home, but to me it was 
a conversation long overdue. I’d 
rehearsed with my closest friends exactly how 
to start; my words needed to strike with confidence
but should mitigate shock. Like protecting
them from a grenade I’d thrown at them.

“So... I actually do have some romance in
my life. With a boy.” 


© 2019 Springer Nature Limited. All rights reserved. 


I practised answers to typical questions 
parents ask after their child comes out as 
gay: “Are you sure?”, “Why haven’t you told 
us?”, “Didn’t you like a girl once?” But those 
questions never came, and I wasn’t prepared
for the one my mom did ask: “What about 
kids?” Whether out of sympathy for my 
aspirations to raise children or because of 
her plans of pampering grandchildren, my 
mother quickly recognized that my ability 
to start a family could be jeopardized by 
my sexuality. And she wasn’t wrong; 74% of
American adults are parents, but only 35% of 
lesbian, gay, bisexual and transgender (LGBT) 
adults are parents even though 51% express 
the desire to have children, according to a
2013 survey. As of 2015, two-thirds of minors
living with same-sex couples come from a
previous relationship. But this is changing. 
With homosexuality becoming more accepted 
in parts of the world, people are recognizing 
their sexual identity earlier and might be 
less likely to enter a different-sex marriage. 
As such, fewer same-sex couples are raising 
children, but those children are more likely to 
be born during a same-sex relationship. 

This trend is partly due to increased oppor- 
tunities for same-sex couples to parent, by 
adoption and other means. In vitro fertilization
(IVF) and surrogacy offer partial genetic
relatedness for same-sex female and male 
couples, respectively. However, neither of 
these options delivers full genetic relatedness. 
Although no evidence suggests that genetic 
relatedness is necessary or sufficient for 
parenthood, surveys of biologically infertile 
different-sex couples show its significance. In
2017, one study found that more than 97% of 
respondents would prefer having a genetically 
related child (S. Hendriks et al. Hum. Reprod. 
32, 2076-2087; 2017). 

Now, as a graduate student performing 
research in chemical biology at the University
of Chicago, Illinois, I think a lot about the intersection
between my sexuality and my scientific
interests. Genome-editing techniques are 
currently transforming our capacity to study 
fundamental biology. But, more importantly 
for me, they have offered a glimmer of hope 
that I could one day raise a biological descend- 
ant with my partner. 

The road to same-sex human reproduction 
is one that many think is impossible to trav- 
erse. Aside from ethical and sociopolitical 
roadblocks, there are fundamental biologi- 
cal issues. 

Parthenogenesis, or reproduction from 
an egg cell without fertilization, occurs nat- 
urally in birds and sharks. But mammalian 
reproduction is complicated by genomic 
‘imprinting’, in which some genes are modified
or shut down in either sperm or eggs while
their opposite numbers are expressed — like 
the two halves of a zipper coming together. 
Seeking to address this, researchers have 
derived ‘imprint-free’ stem cells. A 2018 
report in Cell Stem Cell described the use of 
CRISPR to delete imprinted regions from 
mouse genomes — removing the teeth from 
the biological zipper (Z.-K. Li et al. Cell Stem
Cell 23, 665-676; 2018). Use of this technique
with eggs from female mice produced living 
pups that grew to be healthy, fertile adults. 
However, pups produced using the technique
with sperm from male mice did not survive 
to adulthood. While a significant achieve- 
ment, many see the low success rate of birth 
(14% with embryos from two mothers, 2.5% 
with embryos from two fathers) as proof that 
mammals are limited to sexual reproduction. 
However, the technique offers optimism that 
same-sex human reproduction may be possible
with a better understanding of imprinting,
among other advances. 

The development of same-sex reproduction 
technology might in 2019 be a scientific fan- 
tasy, and its use would be controversial. But IVF 
and same-sex marriage would have been just 
as unthinkable in 1869, when Nature launched 
from a foundation of academic liberalism 
and bold science. The disruptive innovation 
of same-sex reproduction would simply con- 
tinue this endeavour and provide children to 
capable parents, as long as it is investigated 
enough to eliminate risks, made financially 
accessible and regulated responsibly. 

As for me, I aspire to give my parents a grandchild
by any plausible means when my partner
and I are ready. But to raise a child genetically 
related to me and my partner? That’s a dream
I'll always have. 


Matthew Zajac is a chemical biologist at the 
University of Chicago, Illinois, USA. 
Neurodegeneration


Selective clearance of 
mutant huntingtin protein 


Huda Y. Zoghbi 


Compounds have been found that reduce levels of the harmful 
protein present in Huntington’s disease, without affecting the 
normal version. The compounds interact with the mutated 
protein and the cell’s protein-clearance machinery. See p.203 


Several neurodegenerative diseases involve 
the slow accumulation of a misfolded protein
in neurons over many years. The proteins 
involved in these diseases might differ, but 
the result is similar — eventually, the neurons 
die from the build-up of toxic misfolded pro- 
teins. Scientists have long been searching for 
ways to reduce the levels of the disease-driving 
proteins without also clearing their wild-type 
counterparts, which typically have myriad crucial
functions. On page 203, Li et al.1 show that
this can be accomplished using compounds 
that interact specifically with both the mis- 
folded part of the protein and the neuron’s 
protein-clearance machinery. 

Li and colleagues chose to focus on 
Huntington’s disease, which is caused by 
an abnormally long stretch of glutamine 
amino-acid residues in the huntingtin (HTT) 
protein. This expanded polyglutamine tract 
causes HTT to misfold. Affected individuals 
typically carry one copy (one allele) of the 
HTT gene that encodes the mutant protein, 
and one allele that encodes a protein with the
normal-length glutamine tract. 

Cells are able to degrade the mutant 
huntingtin (mHTT) through autophagy2, a
clearance mechanism that involves engulfment
of proteins by a vesicle called the autophagosome.
Li et al. hypothesized that compounds
that bind to both the mutant polyglutamine
tract and the protein LC3B, which resides in the
autophagosome, would lead to engulfment 
and enhanced clearance of mHTT (Fig. 1). But 
no such compounds had been reported. The 
authors therefore conducted small-molecule 
screens to identify candidate compounds, and 
used wild-type HTT in a counter-screen to rule
out compounds that bind to the normal version
of the protein. 

Li and colleagues initially identified
two candidates, dubbed 10O5 and 8F20.
These compounds had been shown3,4 to
inhibit, respectively, the activity of the
cancer-associated protein c-Raf and kinesin
spindle protein (KSP), which has a key role in
the cell cycle. The team found that 10O5 and
8F20 were able to clear mHTT independently
of their effects on these other proteins.

The researchers showed that the regions
of 10O5 and 8F20 that interacted with mHTT
and LC3B in the screen shared structural similarities.
Next, they screened for compounds
that shared these structural properties but
were structurally distinct from other c-Raf 
and KSP inhibitors (a compound that acts on 
mHTT without altering these proteins would 
be desirable for clinical treatment). This led 
them to discover two more compounds, AN1
and AN2, that link mHTT to LC3B and thereby 
selectively reduce levels of mHTT. 

Li and colleagues validated their exciting 
discovery by showing that the four compounds 
reduced levels of the full-length mHTT protein
(not just the protein fragment used in the
screen). The compounds lowered levels of
mHTT both in vitro (in mouse neurons and
neurons derived from the biopsied skin cells of
people with Huntington's disease) and in vivo,
in mouse and fly models of the disease.

A key strength of the compounds identified
by Li and co-workers is that they leave levels
of wild-type HTT unchanged. This is crucial
because HTT has multiple neuronal functions,
both during embryonic development
and after birth5. Existing mHTT-lowering strategies
typically affect both HTT alleles6,7, which
is not ideal. Equally, the compounds found by
Li et al. did not affect other proteins that contain
polyglutamine tracts of variable, but not
disease-causing, length. These proteins often 
have many roles in the brain. 

Figure 1 | Lowering levels of mutant huntingtin protein. a, Healthy neurons typically carry two copies of
the gene that encodes the wild-type version of huntingtin protein (wtHTT). Only two proteins are shown, for
simplicity, although many are produced from each gene copy. b, Huntington’s disease involves the expansion
of a tract of glutamine amino-acid residues in one copy of HTT protein, producing a mutant version (mHTT)
that accumulates in neurons, causing them to shrink and eventually die. Any strategy to decrease levels of
mHTT in these cells must not affect wtHTT, which has key functions in the brain. c, Li et al.1 have identified
four linker compounds that fulfil this role. Treatment with these compounds inhibits neuronal degeneration
in various models of Huntington’s disease. The compounds bind to both mHTT and the protein LC3B, a key
component of a protein-clearance pathway called autophagy. This enables selective engulfment of mHTT by a
vesicle called the autophagosome, leading to the mutant protein’s degradation.

One question that naturally arises is whether
treating cells with the compounds led to
enhanced autophagic clearance of proteins 
other than mHTT. Li et al. assessed the levels 
of the repertoire of proteins in the cortices of 
mice that carried an mHtt allele. They found 
changes in the abundance of a small percent- 
age of proteins in mice treated with the com- 
pounds, compared with untreated animals. 
What remains unclear is whether the levels of 
some proteins decreased because mHTT lev- 
els were diminished, or because of autophagy. 
Modest changes in protein-expression level (in 
the 20-30% range for some wild-type proteins) 
can cause neurological deficits8, so pinpointing
any off-target effects of the compounds will be
a crucial next step. Even effects that initially 
seem inconsequential might build up over 
the course of long-term therapy, becoming as 
problematic decades later as the original toxic 
protein. 

Despite these concerns, the authors found 
encouraging evidence that the compounds 
could produce functional improvements in 
models of Huntington’s disease across three 
species. First, patient-derived neurons treated 
with each of the compounds showed signifi- 
cantly less shrinkage, degeneration of neuronal 
projections and cell death than was seen in 
untreated neurons. Second, flies that model 
Huntington's disease and were treated with the 
compounds recovered climbing ability and sur- 
vived longer than did untreated counterparts. 
Third, treated mice that model Huntington’s 
disease showed improvements in three motor 
tests, compared with untreated mice. That said, 
preclinical trials in mice will be necessary to 
ascertain that the benefit is sustained and 
robust over the course of long-term therapy. 

Finally, Li et al. analysed mutant ataxin-3, a 
protein that is involved in a neurodegenerative
disorder called spino-cerebellar ataxia type 3. 
The researchers found that the compounds 
targeted the long polyglutamine tract of 
mutant ataxin-3 and lowered protein levels. We 
already know that small reductions in the levels
of mutant ataxin-1, ataxin-2 and ataxin-3 can
reduce the severity of spino-cerebellar ataxia
types 1, 2 and 3, respectively, in mouse models9.
Thus, this therapeutic strategy might
be useful not only for Huntington’s disease, 
but also for other diseases involving expanded 
polyglutamine tracts. 

Moving forwards, there are three major 
research paths to pursue. The first involves 
establishing the mechanism by which Li and 
colleagues’ compounds recognize proteins 
with expanded polyglutamine tracts but spare 
normal proteins. Perhaps the compounds rec- 
ognize a particular structural conformation 
that arises only after the polyglutamine tract 
exceeds a specific length. The second involves 
testing the compounds in other models of 
polyglutamine disorders and assessing their 
effects. 

The third path involves conducting similar
small-molecule screens for compounds that
can clear polyglutamine proteins using other 
types of protein-clearance machinery. For 
instance, small molecules dubbed proteolysis-targeting
chimaeras (PROTACs) link a ubiquitin
ligase enzyme to a protein of interest. The
enzyme tags the protein with ubiquitin groups,
leading to the protein’s degradation by a cellular
machine called the proteasome10. PROTACs
have yet to be applied to a polyglutamine- 
expanded protein. But given that some of these 
proteins are degraded by the proteasome, the 
strategy could well prove viable — as long as 
it targets only the abnormally long polyglu- 
tamine tract. 


Huda Y. Zoghbi is in the Departments of
Neurology, and Neuroscience, Baylor College


Soft microbots controlled
by nanomagnets 


Xuanhe Zhao & Yoonho Kim 


Arrays of nanoscale magnets have been constructed to form
the magnetized panels of microscopic robots, thus allowing
magnetic fields to be used to control the robots’ shape and
movement. See p.164


In science-fiction films, robots are often 
depicted as human-sized or larger machines 
made of rigid materials. However, robots made 
of soft materials or with flexible structures, 
and that can be much smaller than the human 
body, have attracted great interest in the past
few years because they have the potential to 
interact with humans more safely than can 
rigid machines. Indeed, sufficiently small 
soft robots could even be used for biomedi- 
cal applications in the human body. Various 
options are available to power these robots, 
but magnetic fields offer a safe and effective 
means of wireless operation in confined spaces
in the body. On page 164, Cui et al.1 report a
key step towards the fabrication of micrometre-scale
robots that, in a programmable
manner, can quickly morph into different 
shapes in applied magnetic fields. 

The ability of minerals known as lodestones 
to align with Earth’s magnetic field was first 
reported in the ancient Chinese manuscripts 
Gui Gu Zi and Han Fei Zi, and was later used 
in early magnetic compasses2. A similar principle
has been used in the past few years in
magnetic soft robots, in which magnets
of varying sizes (nanometres to millimetres)
are integrated into flexible structures or soft
materials. The tendency of the magnets to 
orient in externally applied magnetic fields 
provides a way of quickly moving or changing 
the shape of these untethered robots remotely. 
This actuation mechanism allows much flexi- 
bility in the design of the robots’ structures, 
magnetization patterns and strengths, and in
when and where magnetic fields are applied 
to control the robots. In addition, because 
the forces and torques exerted on magnets 
by external magnetic fields can be accurately 
calculated, models have been developed 
to quantitatively describe the actuation of 
specific robot designs.
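
The torque calculation mentioned above can be sketched in a few lines. This is only an illustration, assuming that each magnetized panel behaves as a point magnetic dipole with moment m in a uniform field B, so that the torque is the cross product m × B and the energy is -m · B. The function names and numerical values are hypothetical and are not taken from Cui and colleagues' work.

```python
# Minimal sketch: torque and energy of a magnetic dipole in a uniform field.
# Assumption: a magnetized panel is approximated as a point dipole.
# Units: moment in A*m^2, field in tesla, torque in N*m, energy in joules.

def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def torque_on_dipole(moment, field):
    """Torque on a dipole: tau = m x B."""
    return cross(moment, field)


def dipole_energy(moment, field):
    """Potential energy of a dipole: U = -m . B (lowest when aligned)."""
    return -dot(moment, field)


# Hypothetical panel magnetized in-plane (along x), field applied out of plane (along z).
panel_moment = (1e-12, 0.0, 0.0)
applied_field = (0.0, 0.0, 0.01)

print(torque_on_dipole(panel_moment, applied_field))
print(dipole_energy(panel_moment, applied_field))
```

In this toy configuration the torque points along the y axis, so the panel tends to rotate out of its plane towards the field direction, which is the kind of hinge bending the paragraph describes.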

Magnetic soft robots have been developed 
for various uses, especially in biomedical 
applications in which they interact closely 
with the human body. For example, self-folding
‘origami’ robots have been reported that
can crawl through the gut, patch wounds
and dislodge swallowed objects; and capsule-shaped
robots have been made that roll
along the inner surface of the stomach and can
perform biopsies and deliver medicine. Magnetically
steerable robotic catheters have also
been developed, which can perform minimally
invasive surgery on the heart or inspect lung
airways. And much thinner, thread-like
robots have been made that could potentially
navigate the brain’s blood vessels to treat
strokes or aneurysms. These robots range
in size from hundreds of micrometres to a few
centimetres in diameter.

Figure 1 | Magnetic soft microbots morph on cue. a, Cui et al.1 have fabricated microscopic components
consisting of magnetized panels connected by flexible hinges. When an external out-of-plane magnetic field
is applied, the panels move in a direction that depends on the panels’ direction of magnetization (red arrows)
and on the direction of the applied field. For example, this two-panel system bends at the hinge. b, Robots
assembled from panels that have different magnetization directions can thus be made to undergo complex
movements when a sequence of magnetic fields is applied, such as this bird producing flapping movements.

Further miniaturization of magnetic soft 
robots could enable new applications, such as 
performing operations in the smallest blood 
vessels and manipulating single cells, but the 
fabrication of such tiny machines poses a considerable
challenge. Existing methods for the
construction of small magnetic soft robots
have included the direct assembly of magnetic
components, the magnetization of particle-loaded
polymer sheets, and the printing of
soft composite materials that contain aligned
magnetic particles. Cui and colleagues now
push the technological boundaries further, by 
using a technique called electron-beam litho- 
graphy to make magnetically reconfigurable 
robots at scales of just a few micrometres. 
More specifically, this technique enables them 
to prepare arrays of nanoscale cobalt magnets 
in panels on a thin, flexible substrate of silicon
nitride (Si3N4).

The authors’ cobalt nanomagnets can 
retain their magnetism after exposure to an 
external magnetic field. This behaviour is 
called hysteresis, and results, in part, from 
the nanomagnets’ shape. The authors could 
therefore tune the nanomagnets’ magnetic 
properties and hysteretic behaviour so that 
thinner nanomagnets were harder to magnet- 
ize than thicker ones; in other words, stronger 
magnetic fields were required to magnetize 
thinner nanomagnets. This, in turn, meant that
it was easier to re-magnetize thicker magnets
(to ‘over-write’ the strength and direction of
their magnetization) using relatively weak
fields. 

Cui and colleagues could therefore 
selectively tune the magnetization of the 
nanomagnets so that an actuating magnetic 
field (much weaker than the fields that ini- 
tially magnetized them) caused different 
panels to fold in different ways. The result- 
ing multi-panelled components were thus 
‘programmed’ to morph into specific con- 
figurations in an actuating magnetic field 
(Fig. 1). These components could, in turn, be 
assembled to produce complex shapes, such 
as letters, and even to make a microscopic 
‘bird’ that produces motions such as turning, 
flapping and slipping across a surface. 




Much work must still be done to achieve 
the full potential of magnetic soft robots for 
biomedical applications across various length 
scales. They must be designed using quanti- 
tative models to optimize their performance 
for specific tasks in relatively weak magnetic 
fields — that is, to work out which reconfigu- 
rations are needed, the sizes of the forces that 
the robot must exert on its environment, and 
the speeds at which reconfigurations should 
occur and with which the forces should be 
applied. Advanced fabrication platforms, 
such as the one used by Cui et al., will be crucial
for implementing future designs. 

Methods for the real-time imaging and
localization of robots deep in the human body
are also needed, particularly in tight spaces,
and must not interfere with the magnetic-actuation
mechanisms. Artificial intelligence
might be further developed to assist image
analysis and robot control. Lastly, methods are
needed for the safe retrieval or degradation of
robots once they have performed their tasks.
Degradation without toxicity or other adverse
effects is particularly desirable.

© 2019 Springer Nature Limited. All rights reserved.

Magnetic soft robots are also being exten- 
sively studied for applications beyond 
biomedicine, such as in flexible electronics,
reconfigurable surfaces and active meta- 
materials (engineered materials consisting 
of subunits that take in energy locally, and 
then translate it into movement that can pro- 
duce large-scale dynamic motion). A parallel 
set of platforms for the design, fabrication, 
imaging and control of magnetic soft robots 
across various length scales are therefore 
under development. That work, together 
with developments such as those of Cui and 
colleagues, is laying the foundation for this 
nascent field.

News & views


Cancer genetics 


Genomes captured during
tumour spread

Jillian F. Wise & Michael S. Lawrence 


A better understanding of the genetic changes that enable 
cancers to spread is crucial. A comprehensive study of
whole-genome sequences from metastatic cancer will help 
researchers to achieve this goal. See p.210 


The major cause of cancer-related deaths is 
the spread of cancer cells from their primary 
site to other parts of the body1. This spreading
process, known as metastasis, typically
involves cellular stressors and environmen- 
tal shocks that induce dramatic changes 
in cancer cells. One such change is a fierce 
resistance to current therapies, which means 
that new ways to combat metastatic disease 
are urgently needed. On page 210, Priestley 
et al.2 use whole-genome sequencing (WGS)
to illuminate the genomic changes that underpin
metastasis in 22 types of solid tumour.
Although previous studies3,4 have unearthed
some hints of such changes, this is perhaps the
first pan-cancer metastasis study of its size to
exploit the power of WGS. 

Priestley et al. characterized 2,520 samples
of metastatic tumours from people with
cancer (Fig. 1). In each case, they also analysed
a sample of non-cancerous blood cells from
the same person. Using WGS, the authors produced
a rich catalogue of the genetic mutations
found in each metastasis. This catalogue
complements existing inventories from both 
metastasis-sequencing studies and genomic 
databases of primary tumours, and offers 
several interesting insights. For example, the 
authors reveal frequent mutations in the gene 
MLK4; this is consistent with a previous study
that connected an increased number of copies
of MLK4 with metastasis5.

Most of the authors’ findings confirm
previous work on metastatic cancers3,4. For
instance, other studies did not find recurrent
cancer-causing mutations that were specific
to metastatic tumours (that is, absent
in the primary tumour) and that thus might
have triggered metastasis. This has led to
speculation that, at least in solid tumours,
metastasis-specific mutations are not the
major cause of cancer spread. Priestley et al.
also found limited evidence of such mutations.

Figure 1 | Characteristics common across metastatic cancers. Cells in a primary tumour typically harbour
cancer-causing mutations (oncogenes). As the cancer evolves, it acquires further mutations that enable it to
spread to other sites in the body through the blood, a process called metastasis. Priestley et al.2 sequenced
the entire genomes of 2,520 metastatic tumours, across 22 cancer types. They find frequent mutations in the
gene MLK4. They also report widespread structural variations, such as whole-genome doubling (which they
find to be especially common) and deletions of large chromosomal regions.

The researchers analysed not only 
single-nucleotide (point) mutations, but 
also large structural variations, including the 
deletion of DNA sequences and transloca- 
tions of DNA from one chromosomal region 
to another. Structural variations are difficult to 
detect using sequencing techniques that cover 
small portions of the genome — sequencing 
of only protein-coding regions, for instance, 
or of even smaller targeted sequences. These 
techniques are used more frequently than WGS 
in clinical studies because of their affordability.
Documentation of large structural variants is 
therefore a valuable feature of Priestley and 
colleagues’ WGS study. 

In particular, the report reveals pervasive 
whole-genome doubling (WGD), in which 
the entire chromosome inventory is copied. 
Priestley et al. find WGD in up to 80% of cases 
in certain types of metastatic cancer, whereas
the phenomenon has been reported in only
about 30% of primary tumours. Linked to
chromosomal instability, WGD can confer 
multidrug resistance to chemotherapy. 
Furthermore, it might provide a buffer for 
cancer cells against the detrimental effects 
on fitness caused by genomic instability, 
such as damaging mutations and losses of 
chromosomal segments.

Although Priestley and colleagues present 
a landmark study, future efforts could benefit
from researchers also sequencing each 
person’s primary tumour. Doing this would 
have allowed Priestley et al. to generate a 
detailed reconstruction of how each cancer’s 
genome evolved along the route to metas- 
tasis. To compensate for this limitation, the 
authors leveraged a large WGS study of primary 
tumours (the International Cancer Genome 
Consortium’s pan-cancer analysis of whole 
genomes). The researchers compared point
mutations and small insertions and deletions
between the two studies. These analyses largely
confirmed a previous report of high genomic
concordance between primary and metastatic
tumours. However, the comparison also
revealed that the ten most commonly mutated 
cancer-causing genes in primary tumours are 
even more frequently mutated in metastatic 
tumours. Furthermore, larger DNA aberrations 
such as structural variations and WGD are 
significantly more common in metastases in 
most cancer types. These findings indicate that 
a hallmark of metastatic progression is ongoing
and accelerating genomic instability. 

Another caveat concerning this study, 
acknowledged by the authors, involves the 
use of fine-needle biopsies as the major 
sample-collection method. These biopsies 
gather cells from only a tiny subregion of a
metastatic site. The authors report that, on
average, more than about 93% of mutations 
detected in a given sample were present 
in every cell of that sample. This is in stark 
contrast to previous studies, which have
reported much higher levels of variation. The 
extreme homogeneity observed by Priestley 
et al. could, in principle, reflect the fact that 
only a few founding cancer cells colonized 
each metastasis, but might instead reflect 
the limited regional sampling achieved by the 
fine-needle biopsy method. 

Future clinical studies of metastasis are 
likely to consider liquid biopsies as an alter- 
native collection method. Liquid biopsies 
involve collecting samples of a person’s 
blood and applying specialized laboratory 
techniques to isolate cancer-derived com- 
ponents, such as circulating tumour cells, 
circulating tumour DNA and released sub- 
cellular vesicles. This approach is less inva- 
sive than fine-needle or surgical biopsies. It 
also offers other advantages, including the 
ability to collect cells simultaneously from all 
metastatic cancer sites in the body (instead of 
just one), and to repeat sampling at multiple 
times during treatment, thereby providing 
dynamic temporal information about a cancer
and its response to therapy. Liquid biopsies 
also enable researchers to document meta- 
static evolution at the DNA, RNA and protein 
levels in parallel.

Ultimately, the true value of any research 
comes from improvements to treatment. To 
maximize the potential for clinical impact, 
Priestley and colleagues’ data set is open-access.
The authors have already accumulated
more than 80 collaborative requests to inves- 
tigate topics ranging from the possible pres- 
ence of viral genetic material in the samples 
to the relationship between the sequences 
and patient drug responses (go.nature.com/2ommmn2).
The data set is also being
used to investigate whether any mutational 
variants involved in driving metastasis lie in 
regulatory DNA regions, and to enable efforts 
to deduce the anatomical origin of metastatic 
cancers diagnosed without a known primary-tumour
site. Indeed, it is already powering
exploration of these questions. The publicly
available repositories are also being used in a
Drug Rediscovery protocol, in which patients
with metastases who have exhausted standard
therapies are matched with promising off-label
treatments (anticancer medicines that have not
been specifically approved for use against the
person’s type of cancer) on the basis of results
from WGS. 

Obtaining metastatic biopsies is not without 
risks to the patient, such as bleeding and infec- 
tion. This is partly why sample collection has 
been so limited until now. Those who donated
samples to this study have provided research- 
ers with a valuable gift. It is hoped that the 
database will, in turn, provide the new insights
and therapeutic strategies that are so urgently
needed.


Progress on the
proton-radius puzzle


Jean-Philippe Karr & Dominique Marchand 


Atomic physicists and nuclear physicists have each made a 
refined measurement of the radius of the proton. Both values 
agree with a hotly debated result obtained by spectroscopy of 
an exotic form of hydrogen called muonic hydrogen. See p.147 


The proton, discovered 100 years ago1, is an
essential building block of visible matter. The
nucleus of a hydrogen atom consists of a single
proton, making this atom a suitable platform
for determining the proton’s intrinsic proper- 
ties. One such property is the proton charge 
radius, which corresponds to the spatial extent 
of the distribution of the proton’s charge. In 
2010, a highly accurate measurement of the 
proton radius was made using spectroscopy 
of muonic hydrogen — an exotic form of 
hydrogen in which the electron is replaced 
by a heavier version called a muon2. However,
the value obtained was almost 4% smaller
than the previously accepted one3. Bezginov
et al.4, writing in Science, and Xiong et al.5,
on page 147, report experiments that could 
represent a decisive step towards solving this 
proton-radius puzzle. 

Atomic physicists determine the proton 
radius by measuring the energy difference 
between two electronic states of a hydrogen 
atom using spectroscopy. According to quan- 
tum mechanics, there is a non-zero probabil- 
ity that the electron will be found inside the 
proton if the electron is in a rotationless state
(an S state). When inside, the electron is less 
strongly influenced by the proton’s electric 
charge than it would otherwise be. This effect 
slightly weakens the binding of the electron 
and proton, and causes a tiny shift in the 
energy of the S state with respect to other 
states. The high precision achieved both by 
experiments and by the theory of quantum
electrodynamics allows this energy shift and,
in turn, the proton radius, to be extracted from
measurements. 

A muon is about 200 times heavier than an 
electron. As a result, there is a much higher 
probability that the muon in a muonic-hydrogen
atom will be found inside the proton than
would the electron in an ordinary hydrogen
atom. Consequently, the associated energy
shift is about 8 million (200³) times larger
for muonic hydrogen than for regular hydrogen6.
Muonic hydrogen is therefore a highly
sensitive probe of the proton radius. 
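
The factor of 8 million can be checked with a few lines of arithmetic. The sketch below assumes the standard textbook scaling in which the finite-size energy shift of an S state is proportional to the cube of the orbiting particle's mass (because the probability density at the nucleus grows as the cube of the inverse orbit size). The mass ratios are commonly quoted values supplied here for illustration, and the function names are hypothetical.

```python
# Back-of-envelope check of the mass-cubed scaling quoted in the text.
# Assumption: finite-size shift of an S state scales as (orbiting mass)^3.
# Masses are in units of the electron mass (standard reference values).
MUON_MASS = 206.768      # muon mass / electron mass
PROTON_MASS = 1836.153   # proton mass / electron mass


def reduced_mass(lepton_mass: float, nucleus_mass: float = PROTON_MASS) -> float:
    """Reduced mass of a lepton bound to a nucleus, in electron-mass units."""
    return lepton_mass * nucleus_mass / (lepton_mass + nucleus_mass)


def shift_enhancement(mass_ratio: float) -> float:
    """Enhancement of the finite-size shift for a given mass ratio."""
    return mass_ratio ** 3


rounded = shift_enhancement(200)            # the article's round figure
bare = shift_enhancement(MUON_MASS)         # using the measured mass ratio
reduced = shift_enhancement(reduced_mass(MUON_MASS) / reduced_mass(1.0))

print(f"200^3                  = {rounded:,.0f}")
print(f"(muon/electron mass)^3 = {bare:,.0f}")
print(f"(reduced-mass ratio)^3 = {reduced:,.0f}")
```

The round figure of 200 reproduces the quoted 8 million. Using the measured mass ratio, or the reduced masses of the two bound systems, changes the number somewhat but leaves the enhancement at several million.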

Bezginov and colleagues’ work concerns the 
Lamb shift of ordinary hydrogen — the energy 
difference between the 2S and 2P excited 
states. This shift was investigated previously 
in muonic hydrogen2,7. To measure the Lamb
shift, the authors developed an experimental
method that derives from a technique known
as Ramsey interferometry, which is used in 
atomic clocks. 

This experimental method has many 
technical advantages over other approaches 
with regard to eliminating systematic uncer- 
tainties, filtering environmental noise, and 
simplicity in the shape of the spectral signal. 
A key feature of the set-up is the ability to 
measure a full spectrum in only a few hours. 
This allowed Bezginov et al. to carry out a 
meticulous study of systematic uncertain- 
ties and to extract a precise value for the 
proton radius: 0.833 ± 0.010 femtometres
(1 fm is 10⁻¹⁵ metres).

Figure 1 | Values for the proton radius. A key property of the proton is its charge radius, the spatial extent
of its charge distribution. This quantity is expressed in femtometres (1 fm is 10⁻¹⁵ metres). The data points
are values for the proton radius obtained over the past decade, including the latest results, from Bezginov
et al.4 and Xiong et al.5, with uncertainties indicated by the error bars. The data were obtained using three
different measurement techniques: electron-proton scattering5,10, spectroscopy of ordinary hydrogen4,9,13
and spectroscopy of an exotic type of hydrogen called muonic hydrogen2,7. The error bars for the two data
points associated with muonic-hydrogen spectroscopy are too small to be depicted in this figure. The bands
denote the values adopted by the Committee on Data for Science and Technology (CODATA) in 2014 (ref. 11)
and in 2018 (see go.nature.com/2bwkrqz).


Nuclear physicists measure the proton 
radius using the ‘elastic’ scattering of elec- 
trons from protons. In this interaction, the 
incident electron transfers energy to the 
targeted proton through the exchange of a 
virtual (transient) photon. In a similar way to
microscopy, short-wavelength photons (which
transfer a lot of energy) reveal details at small
scales. To determine the full extent of the 
proton’s charge distribution, one should, in 
principle, use photons of infinite wavelength 
(that transfer zero energy), but no scattering at 
all would occur in this situation. Experiments 
therefore aim to achieve the lowest-possible 
energy transfer and then to extrapolate down 
to zero. This extrapolation, which relies on 
a parameterization of experimental data, 
is one of the main challenges in precisely 
determining the proton radius. 
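
The extrapolation problem described above can be illustrated with a toy calculation. The sketch assumes the standard relation between the charge radius r and the slope of the electric form factor G_E at zero momentum transfer, r² = -6 dG_E/dQ² at Q² = 0 (with Q² in fm⁻²), and uses a dipole-shaped form factor purely as a stand-in. The dipole shape, the straight-line fit and the function names are illustrative assumptions, not the parameterizations studied by Xiong and colleagues.

```python
import math

# Toy illustration: extracting a radius by fitting the low-Q^2 slope of a form factor.
# Assumptions: r^2 = -6 * dG_E/dQ^2 at Q^2 = 0, Q^2 in fm^-2, and a dipole-shaped
# G_E chosen only because its slope at the origin is known exactly.


def dipole_form_factor(q2: float, radius_fm: float) -> float:
    """Dipole G_E(Q^2) whose slope at Q^2 = 0 equals -radius_fm**2 / 6."""
    return (1.0 + q2 * radius_fm ** 2 / 12.0) ** -2


def fit_slope(xs, ys) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def extracted_radius(q2_max: float, true_radius: float = 0.83, points: int = 50) -> float:
    """Fit a straight line to synthetic data on (0, q2_max] and convert the slope to a radius."""
    q2_values = [q2_max * (i + 1) / points for i in range(points)]
    g_values = [dipole_form_factor(q2, true_radius) for q2 in q2_values]
    return math.sqrt(-6.0 * fit_slope(q2_values, g_values))


for q2_max in (1.0, 0.1, 0.01):
    print(f"fit range up to {q2_max} fm^-2 -> radius {extracted_radius(q2_max):.4f} fm")
```

Because the toy form factor curves upwards, a straight-line fit over a wide range underestimates the radius, and the estimate approaches the input value as the fit range shrinks towards zero. Real analyses use richer parameterizations, but the example shows why data at very low energy transfer are so valuable.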

Xiong and colleagues implemented several 
key improvements over previous studies in 
their experiment, the Proton Radius experi- 
ment at Jefferson Laboratory in Virginia. 
Crucially, this investigation explores 
extremely low energy transfers (ten times 
closer to zero than previous data) while also 
probing larger energy transfers, to ensure 
consistency with existing data. The scattered 
electrons were detected through their energy 
loss in a detector called an electromagnetic
calorimeter. This set-up avoided the need to 
use a magnetic spectrometer, the multiple 
settings of which induce systematic errors. 

Furthermore, rather than making absolute 
measurements, Xiong et al. advantageously 
relied on relative measurements. Specifically, 
they determined the ratio between the number 
of events corresponding to elastic electron- 
proton scattering and the number related to
Møller scattering, a well-understood and
calculable quantum-electrodynamics process 
in which electrons are scattered from atomic 
electrons. This strategy led to the cancellation 
of many systematic effects that are associated 
with absolute measurements. 

In addition, the protons were in a hydrogen
gas that was kept inside a chamber that
did not have the entrance and exit windows
used in previous similar experiments.
This arrangement avoided background noise
that would have been produced by the inter- 
action of particles with window materials. 
Overall, Xiong and colleagues’ chosen set-up, 
careful systematic-uncertainty checks at each 
step and exhaustive study of several para- 
meterizations to extrapolate the data to zero 
energy transfer lend support to their value for 
the proton radius: 0.831 ± 0.014 fm.

The independent measurements of the 
proton radius made by Bezginov et al. and 
Xiong et al. are precise and consistent (Fig. 1).
They tip the scales in favour of a small proton
radius, in agreement with the highly accurate
results from muonic-hydrogen experiments.

But to conclusively solve the proton-radius 
puzzle, one still needs to understand why 
there are discrepancies between the latest 
results and the data from previous hydrogen-spectroscopy
and electron-proton scattering experiments. For instance, the
value of the proton radius adopted by the
Committee on Data for Science and Technology
in 2014 was 0.8751 ± 0.0061 fm. Because
no convincing explanation for these dis- 
crepancies has been proposed, worldwide 
efforts must be pursued to validate the latest 
results and to critically assess the different 
measurement techniques. 
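
The size of the discrepancy can be put in rough numbers. The Python sketch below takes the two values quoted above and expresses their difference in units of the combined standard uncertainty. It treats the two uncertainties as independent and Gaussian, which is an assumption made here for illustration rather than a statement from either analysis; the function name is likewise hypothetical.

```python
import math

def tension_sigma(value_a, err_a, value_b, err_b):
    """Difference between two measurements in units of their combined
    standard uncertainty, assuming independent Gaussian errors."""
    combined = math.hypot(err_a, err_b)
    return abs(value_a - value_b) / combined

# Values quoted in the text, in femtometres.
xiong = (0.831, 0.014)           # electron-proton scattering result
codata_2014 = (0.8751, 0.0061)   # value adopted in 2014

difference = abs(xiong[0] - codata_2014[0])
sigma = tension_sigma(*xiong, *codata_2014)
print(f"difference = {difference:.4f} fm, tension = {sigma:.1f} sigma")
```

Under these assumptions the two values differ by just under three combined standard deviations, which is large enough to need an explanation but not a formal treatment of correlated systematics.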

Next-generation experiments will provide 
innovative approaches to this task. For example,
the Muon Scattering Experiment at
the Paul Scherrer Institute in Switzerland is 
simultaneously investigating muon-proton 
and electron-proton scattering. This experi- 
ment is testing for possible differences in 
the behaviour of electrons and muons — an 
observation that would imply the existence 
of physics beyond that of the standard model 
of particle physics. On the spectroscopy 
side, high-precision measurements will be 
extended to other nuclei such as helium, and 
to molecules. It is highly probable that the 
harvest of results from future experiments 
will not only definitively solve, but might also
explain, the proton-radius puzzle. 

As illustrated by the 17 UN Sustainable Development Goals
(https://www.un.org/sustainabledevelopment/), the chal- 
lenges facing us are multifaceted and broad. The solutions 
will not lie in any one sector—all of society must engage. But 
science undoubtedly has a part to play in tackling pieces of the 
overarching puzzle of how to live in an equitable and healthy way that
accords with the constraints of our planet and leaves no one behind. 

In this collection of Reviews and Perspectives drawn together to 
celebrate 150 years of Nature, we take a look at a snapshot of areas in 
which science might help ease the transition to a healthy, sustainable 
and inclusive future. 

In no way exhaustive, topics covered include the reuse and recy- 
cling of batteries from electric vehicles, the design and manufacture 
of more sustainable structural alloys, the capture and utilization of 
carbon in products, the resilience of harvestable ecosystems, the 
engineering of climate-resilient crops, progress and challenges in global 
vaccination, and epidemic prevention and response. Emphasised 
throughout is the need for a whole-of-society response if effective 
solutions are to emerge. 

Science will serve us best if it accords with the principles of inclusivity,
transparency and openness. In this vein, we end this Insight with a piece
that shows how attending to one quite glaring blind spot in the scientific
method (the overlooking of sex and gender) can open up new lines of
enquiry and render research findings applicable to all.

There can be no division between science and society: even the
purest and most abstracted lines of enquiry emanate from a social fabric
and time, with its characteristic attitudes, limitations, blind spots and
needs. Registering those restrictions and needs, and working to address
them in a constructive and collaborative way, will help to keep the vision
of a more humane and environmentally sound future, as laid out by the
UN Sustainable Development Goals, alive.

Metallic materials have enabled technological progress over thousands of years. The
accelerated demand for structural (that is, load-bearing) alloys in key sectors such as
energy, construction, safety and transportation is resulting in predicted production
growth rates of up to 200 per cent until 2050. Yet most of these materials require a lot
of energy when extracted and manufactured and these processes emit large amounts
of greenhouse gases and pollution. Here we review methods of improving the direct 
sustainability of structural metals, in areas including reduced-carbon-dioxide 
primary production, recycling, scrap-compatible alloy design, contaminant tolerance 
of alloys and improved alloy longevity. We discuss the effectiveness and technological 
readiness of individual measures and also show how novel structural materials enable 
improved energy efficiency through their reduced mass, higher thermal stability and 
better mechanical properties than currently available alloys. 

Structural metallic materials have a historic and enduring importance
in our society. They have paved the path of human civilization with load-bearing
applications that can be used under the harshest environmental
conditions, from the Bronze Age onwards. Only metallic materials
encompass such diverse features as strength, hardness, workability,
damage tolerance, joinability, ductility and toughness, often combined
with functional properties such as corrosion resistance, thermal and
electric conductivity and magnetism. This versatility comes with a vast
understanding of thermo-mechanical processing of metals (accrued
over millennia of metals use), which in turn enables numerous production,
manufacturing, design, repair and recycling pathways.


Benefit and environmental impact of metallic alloys 

Metals have enabled multiple applications in the fields of energy conversion,
transportation, construction, communication, health, safety
and infrastructure. Examples over the millennia have been agricultural
tools, manufacturing machinery, energy conversion engines and reinforcements
in huge concrete-based infrastructures. Recent applications
include structural alloys for weight reduction combined with
high strength and toughness in the transportation sector, efficient
turbines operating at higher temperatures for power plants and air
traffic, components for safe nuclear and fusion power and disposal,
targeted endurance or corrosive dissolution of biomedical implants,
embrittlement-resistant infrastructures for hydrogen-based industries
or reusable spacecraft. Metallurgical alloys and products boost innovation
and economic growth: the global market for metals is about
3,000 billion euros per year.

The success of the structural metals industry also means that it
has an indisputable role in addressing our environmental crisis. The
availability of metals (most of the elements used in structural alloys
are among the most abundant), efficient mass producibility, low price
and amenability to large-scale industrial production (from extraction
to the metal alloy) and manufacturing (downstream operations
after solidification) have become a substantial environmental burden:
worldwide production of metals leads to a total energy consumption
of about 53 exajoules (EJ; 1 EJ = 10^18 J) (8% of the global energy used) and
almost 30% of industrial CO₂-equivalent emissions (4.4 gigatons of
carbon dioxide equivalent, Gt CO₂eq) when counting only steels and
aluminium alloys (the largest fraction of metal use by volume); see
Table 1. Although the production volumes of nickel and titanium are
much smaller, they have a prominent role in aerospace and biomedical
materials and nickel is primarily used as an alloying element in stainless
steel (accounting for two-thirds of nickel’s uses). The worldwide
annual production in terms of mass, energy and CO₂ is presented in
Table 1, with metal lost in manufacturing for these four key structural
metals (where nickel use in stainless steel is the focus).

Mining and production of these materials have a huge impact in terms 
of resource use, emissions and waste generation, and this impact con- 
tinues to grow, because of trends around urbanization, electrification 
and digitization (in 1950 less than 30% of the population lived in cities 
but this number is projected to exceed 60% by 2025). In addition, there 
are substantial byproducts of both industries that cause considerable 
environmental damage when not managed properly in perpetuity 
(losses throughout the supply chain are shown in Fig. 1 along with the 
quantities of material recovered as scrap). The two largest material 
groups (steel and aluminium) alone create huge mining and extrac- 
tion byproducts, namely, 2,400 million tons (Mt) per year of tailings 
and 220 Mt per year of slags for steels and 160 Mt per year of bauxite
residue for the case of aluminium.

Table 1 | Overview of the energy and environmental impacts of key structural metals

Metal                             | Worldwide annual production (Mt yr⁻¹)        | Energy (EJ yr⁻¹) | CO₂ (CO₂eq yr⁻¹) | Material scrapped in manufacturing
Steel                             | 1,700 (of which 45% is based on scrap input) | 40               | 3.7 Gt           | 25%
Al                                | 94 (of which 30% is based on scrap input)    | 13               | 0.7 Gt           | 40%
Ni (stainless steels/superalloys) | 2.1 (of which 25% is based on scrap input)   | 0.25             | 26 Mt            | 20%
Ti                                | 0.2 (limited post-consumer scrap)            | 0.07             | 6.7 Mt           | 60%
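
The headline totals can be checked against the rows of Table 1. The Python sketch below sums the energy and emissions columns for steel and aluminium, and also divides each column by production to give a rough per-tonne intensity. The per-tonne figures are derived here by simple division and are not stated in the text; the dictionary layout and function names are hypothetical.

```python
# Table 1 rows as quoted in the text: production in Mt/yr, energy in EJ/yr,
# emissions converted to Gt CO2eq/yr (26 Mt = 0.026 Gt, 6.7 Mt = 0.0067 Gt).
TABLE_1 = {
    "steel":     {"production_mt": 1700, "energy_ej": 40,   "co2_gt": 3.7},
    "aluminium": {"production_mt": 94,   "energy_ej": 13,   "co2_gt": 0.7},
    "nickel":    {"production_mt": 2.1,  "energy_ej": 0.25, "co2_gt": 0.026},
    "titanium":  {"production_mt": 0.2,  "energy_ej": 0.07, "co2_gt": 0.0067},
}

def total(column, metals):
    """Sum one column of Table 1 over the given metals."""
    return sum(TABLE_1[metal][column] for metal in metals)

def intensity(metal):
    """Energy (GJ) and emissions (t CO2eq) per tonne of metal produced."""
    row = TABLE_1[metal]
    tonnes = row["production_mt"] * 1e6        # Mt -> t
    gj_per_t = row["energy_ej"] * 1e9 / tonnes   # EJ -> GJ
    t_co2_per_t = row["co2_gt"] * 1e9 / tonnes   # Gt -> t
    return gj_per_t, t_co2_per_t

major = ("steel", "aluminium")
energy_major = total("energy_ej", major)   # EJ per year
co2_major = total("co2_gt", major)         # Gt CO2eq per year
print(f"steel + aluminium: {energy_major:.0f} EJ/yr, {co2_major:.1f} Gt CO2eq/yr")

for metal in TABLE_1:
    gj, t_co2 = intensity(metal)
    print(f"{metal:10s} {gj:6.1f} GJ/t  {t_co2:5.1f} t CO2eq/t")
```

The sums reproduce the 53 EJ and 4.4 Gt CO₂eq quoted for steels and aluminium alloys. The per-tonne values suggest why low-volume metals such as titanium still matter for sustainability per unit of product, even though their absolute totals are small.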

Accidents, such as the iron-ore mining dam collapse in the mineral-rich
state of Minas Gerais, Brazil, in 2019 or the Ajka, Hungary, spill in
2010 where 100,000 cubic metres of red mud breached a dam, show
that these byproducts pose a constant threat and risk associated
with extraction of the precursors to structural alloys. These energy-consumption
challenges and detrimental environmental impacts are
the biggest obstacle for further use of structural metals (Table 1 and 
Fig. 1). 

To outline the critical opportunities towards more sustainable struc- 
tural metals, this Review describes several approaches and measures. 
We discuss direct sustainability effects for different steps along the 
value chain including CO₂-reduced primary production, secondary
production through recycling and more efficient manufacturing (see 
‘Direct sustainability measures’ and ‘From geo-mining to urban mining’ 
sections). In this context, we also discuss opportunities to make alloy 
design more recycling-oriented upfront (see ‘Sustainable alloy design 
and recycling-friendly materials’ section). Another strategy focuses 
on improved alloy longevity through corrosion protection, damage 
tolerance and repairability for longer product use (see ‘Longevity by 
corrosion protection, lifetime extension and re-use’ section). Finally, 
we illustrate how structural metals also enable energy-efficient prod- 
ucts and processes indirectly through improved energy conversion 
and weight reduction in transportation at present or higher safety and 
lower costs (see ‘Higher energy efficiency through lightweighting and 
harsher operating conditions’ section). 

Frequently, addressing environmental impact involves tradeoffs 
and undesired consequences (such as vehicles not becoming more 
fuel-efficient despite materials innovation and the use of lightweight 
alloys because extra performance and luxury items are added). To ensure
that we are improving sustainability outcomes, one must consider the
environmental impact of these strategies quantitatively to avoid unintended
consequences, wherein we improve one aspect to the detriment
of another part of the material system or life cycle. Furthermore,
one must consider the economic feasibility, the technology readiness 
of potential solutions, and the role of governmental legislation. Such 
legislation could mean that metal production will be limited by actions 
to limit greenhouse gas emissions. The scope of our discussion here 
primarily focuses on the technical aspects of proposed solutions, but
we indicate the feasibility and viability of these opportunities, that is,
we assess the effects that the different measures can have on enhancing
the sustainability of structural alloys. 

Figure 2 presents these critical opportunities along two axes, their 
scaled potential for impact versus their technology readiness. In Fig. 2 
we qualitatively rank the impact of each of the strategies, in addition to 
how soon the impact might occur, based on the status of the technology
(and societal willingness to adopt the approach). Each strategy is also 
differentiated by the metal industry where it may have the most impact 
(or not, if the potential impact is for all alloys). For the lower-volume 
alloys, containing mainly titanium and nickel, the qualitative impact 
potential is scaled by the size of the industry. For example, reduced 
scrap in manufacturing holds considerable potential for sustainability 
improvements within titanium-related value chains (green in Fig. 2) 
even though reductions in manufacturing scrap would have a higher
impact overall for steel, given its larger production volumes. Irrespective
of the qualitative nature of Fig. 2 and the subjective placement of
each strategy, we offer this as a framework within which to understand 
the relative potential of each option. 


Direct sustainability measures 

Improving the direct sustainability of structural alloys refers to reduc- 
ing the environmental footprint of production and manufacturing. 
Ideally, moving towards more sustainable materials can be coupled 
with improvement in the material’s performance and longevity. In this
section we focus on the opportunities within production first, and then 
within manufacturing. 

As shown in Fig. 1, large fractions of metal still flow into societal stock
(the infrastructure and products that we use), so efforts must focus
first on production, where there is the most potential for reduction
of carbon dioxide emissions. This means that recycling alone cannot
address production efficiency because the world’s demand for metallic
alloys exceeds the available amount of scrap by about one-third, at
least up to 2050. Improvements in production will vary by metal,
except that all production processes would benefit now from use of
non-carbon-intensive energy sources and better byproduct management
(particularly wastes from steel and aluminium), as we suggest in
Fig. 2. These strategies are more technology-ready than CO₂-reduced
approaches. Another important approach across production of all metals
(also shown in Fig. 2) lies in harvesting the enormous waste heat from
metal production, which could be used for electricity production. We
focus primarily on steel in our discussion on CO₂-reduced production,
with only a brief mention of aluminium, given that steel is where most
of the opportunity exists (Fig. 2).

For low- and medium-alloyed carbon steels, coke-making, blast furnace
operation and steel-making account for the largest fraction of
CO₂ emissions (2 billion tons) along the metal value chain, producing
about 5.5% of the world’s total CO₂ emissions from all fossil fuel burning.
A process with greatly reduced process emissions can be realized
through electrolytic reduction of iron oxide in an alkaline solution at
110 °C and subsequent processing in electric arc furnaces, realizing a
fully electrified synthesis route. This approach is a disruptive alternative
to carbon reduction in the blast furnace because it uses only electricity
rather than fossil fuel as the energy carrier (this electricity must then
come from renewable sources). Up until now, wider use of this process
has been impeded by the high costs and the aggressive conditions to
which the electrodes are exposed. Technology readiness studies assume
that electrolytic iron synthesis, which has not yet reached pilot-plant
scale, is unlikely to enter the market before 2040.

There are also steps that can be taken to render blast furnace and
converter production more sustainable. A reduction in CO₂ emissions
of up to 30% can be reached through (1) the addition of hydrogen to
fossil reducing agents (such as coal) in the blast furnace, which also
increases efficiency and production rate as a result of hydrogen’s high
percolation rate; and (2) CO₂ capture and the downstream catalytic
reduction of the waste CO₂ into alternative chemical products and/or
energy carriers. These techniques are available and are currently
being studied at pilot-plant scale. Whereas CO₂ capture is ready to enter
the market (depending on investment, carbon taxation and political
decision-making to sanction this technique), the use of hydrogen in
existing blast furnaces and downstream chemical conversion of CO₂
into value-added chemical products are less mature technologies, the
first one owing to safety and infrastructure issues and the second one
owing to the impurity of the exhaust gas mixture, which renders the
required catalysis processes challenging.

An alternative to the blast furnace is the solid-state reduction of
ores. In these direct reduction methods, porous iron-oxide fillings
are reduced into pelletized aggregates with >95% Fe content without
going through a liquid phase. Traditionally, the reduction agents
have been carbon carriers (such as methane) in this process. Nowadays,
more complex gas mixtures can also be used, containing hydrogen,
carbon monoxide and/or methane. Compared to traditional blast
furnace ironmaking, CO₂ emissions can be reduced through direct
reduction by up to 50% depending on the hydrogen content of the
gas mixture used. Hydrogen then acts as a reductant along with the
carbon monoxide.

Fig. 1 | Global material flows along the supply chain. Data from 2017 are
shown for iron and steel (a), aluminium (b), nickel (c) and titanium (d).
The width of the flows corresponds to mass at each stage including metal loss
(not to scale). The inset charts show dominant end uses for these materials.
This excludes use of titanium in the pigment sector (at present the largest
sector at around 90%) because our emphasis is on metals.

Fig. 2 | Impact and technology readiness of sustainability measures for
structural alloys. We present an overview of several strategies and suggested
potentials for impact on the sustainability of structural metals. Colours
correspond to metals for which the opportunities are greatest (blue for carbon
steel, orange for aluminium, green for nickel alloys and stainless steels, red for
titanium). Where no colours are shown, all metals provide similar
opportunities for that strategy. We include the qualitative potential (weighted
by the relative scale of each industry) for each strategy by metal.

Hydrogen-enriched direct reduction is already available at industry
scale. The competitive market entry of fully hydrogen-based direct
reduction techniques without any CO₂ emissions is currently being
explored, with market entry expected around 2030. The primary
reduction of aluminium is done via electrolysis so the main opportunities
for improving its sustainability involve using renewable electricity
sources (a strategy that would benefit all metals), improved efficiency
of the electrolysis cells, and lower consumption rates of the currently
used graphite or alternative-material electrodes.

We next shift our attention downstream to manufacturing. In the
downstream manufacturing following primary production, the overall
yield losses occurring through liquid metal processing, forming and
fabrication of aluminium and steel are 40% and 25% by mass, respectively.
These arise primarily from challenges involving the form of
upstream products, the nature of upstream processing, the surface
finish requirements, the supporting materials needed for shaping, and
defects. Energy savings based on eliminating metal loss are estimated at
around 5% and 15% for aluminium and steel, respectively. Alloy-specific
high-quality material recovery already occurs inside casting and rolling
plants where closed-loop procedures are established. Only a few
producer-customer groups have established closed chains of returning
alloy-specific scrap, so there are still substantial opportunities here that
could be aided by data-driven approaches to process control and scrap
sorting (see section ‘From geo-mining to urban mining’).

Near-net-shape manufacturing methods, where parts are cast or
printed with a shape close to the final shape (thus requiring less machining),
may provide an opportunity to reduce manufacturing material
losses. Yet, additive manufacturing is not itself a sustainable synthesis
approach owing to the high loss of powder, which can only be re-used a
few times before it gets too oxidized (see Fig. 2 for the potential benefits
of this strategy, most relevant for aluminium (Al), nickel (Ni) and titanium
(Ti) alloys and tool steels). However, large-scale near-net-shape
manufacturing methods have advanced. Examples are thin-slab and
thin-strip casting of steels and aluminium (medium-term readiness 
and scaled impact; see Fig. 2). In conventional casters, steel slabs have 
a thickness of 150-300 mm. The usual target of a hot band thickness 
of 2-3 mm thus requires >99% mechanical working, resulting in high 
energy and investment costs. Industrial thin-slab technology can pro- 
vide slabs of thickness 25-60 mm. For some steel grades, higher scrap 
fractions and higher contaminant content can be tolerated when cast- 
ing thin slabs, owing to the high solidification rates (resulting in less 
segregation and fewer intermetallics forming). The next step is thin-strip
steel casting. In this approach, liquid steel is cast between two
rolls and exposed to a hot reduction step to manufacture strips 2-3 mm
thick that are directly coilable. The reduction in plastic work, energy and
investment is enormous, yet production speed is slow and thin strips
often do not reach the surface quality demanded by the market. Similar
trends apply to strip production of aluminium alloys. Several belt
and twin roll techniques were developed, which are capable of casting 
aluminium strips with thickness 1-15 mm. Specific challenges lie in 
the resulting centre-line segregation and the associated formation of 
coarse intermetallics, which are particularly caused by typical scrap 
contaminants such as iron (Fe) and silicon (Si). The manufacturing of 
high-quality and high-strength automotive grades, as a typical high- 
end mass product, remains challenging. 
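
The benefit of casting closer to final gauge can be made concrete with the thicknesses quoted above. The Python sketch below computes the fractional thickness reduction needed to reach a 2-3 mm hot band from conventional and thin slabs. Using thickness reduction as a stand-in for the amount of mechanical working is a simplifying assumption made here, and the function names are hypothetical.

```python
def thickness_reduction(initial_mm, final_mm):
    """Fractional thickness reduction from initial to final gauge."""
    if not 0 < final_mm <= initial_mm:
        raise ValueError("need 0 < final_mm <= initial_mm")
    return 1.0 - final_mm / initial_mm

def reduction_range(slab_range_mm, band_range_mm):
    """Smallest and largest reduction for ranges of slab and band thickness."""
    thin_slab, thick_slab = slab_range_mm
    thin_band, thick_band = band_range_mm
    smallest = thickness_reduction(thin_slab, thick_band)
    largest = thickness_reduction(thick_slab, thin_band)
    return smallest, largest

HOT_BAND_MM = (2, 3)  # target hot band thickness from the text
ROUTES_MM = {
    "conventional slab": (150, 300),
    "thin slab": (25, 60),
}

for route, slab_range in ROUTES_MM.items():
    low, high = reduction_range(slab_range, HOT_BAND_MM)
    print(f"{route:18s} {low:.1%} to {high:.1%} thickness reduction")
```

With these inputs, conventional slabs need between 98% and just over 99% reduction, with the upper figure applying to the thickest slabs, whereas thin slabs need roughly 88% to 97%. Thin-strip casting, which produces 2-3 mm strip directly after a single hot reduction step, removes most of the remainder.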


From geo-mining to urban mining 

One of the most efficient direct sustainability methods of reducing 
energy consumption and emissions lies in offsetting some primary 
extraction from geo-mined resources by urban mining of scraps and 
remelting them into new structural alloys. This can involve several 
specific strategies (outlined in Fig. 2) varying in readiness from scrap 
sorting and separation, to within-alloy family recycling (our preferred 
term for closed-loop recycling), alloy design for weight-reduction and 
recycling-oriented alloy design. All have considerable opportunity to
achieve a sustainable impact provided the market is responsive enough
to offset some primary extraction. The ability to do this effectively
varies by alloy family and the source of the material (that is, whether
it is post-industrial or post-consumer). By some measures for steel,
particularly for the European Union and the USA, scrap availability
is beginning to meet steel demand, so it will become critical to avoid
contamination in recovery. Composition-sensitive steel recycling is a
strategy that will offer more value than scrap-compatible steel design.
Recovering post-consumer scrap requires the most sophisticated management
practices and has the most room for improvement given that
new material made from such scrap currently requires dilution with a
large proportion of primary synthesized material in order to obtain
the required composition. As the demand for products that can act as
impurity sinks starts to decrease, we will have a smaller overall potential
for recovery. Steel is typically recovered into the construction industry,
a saturated market in some regions, and aluminium is recovered into
cast products, so as the cast market becomes replete with scrap, the
demand for cast products becomes a recovery limit.

Advanced sorting moves the downcycling that is prevalent in metal
recovery from the scrap yard to the plant floor. Automated probing and
sorting methods have traditionally suffered from high costs resulting
from low throughput and these costs are tightly constrained by the
margins within recovery industries. Given the variety of type, shape,
size, and form of scrap material, it has proved challenging to develop
a wide array of broadly applicable recycling technologies. Such technology
needs to cover identification, size reduction, separation, cleaning
and material liberation. Separation steps include magnetic, sieving
and air separation along with density separation, eddy-current, and
spectroscopic techniques. For these latter steps and even in scrapyard
inspection, elemental analysers based on X-ray fluorescence and
laser-induced breakdown spectroscopy have increasing potential. For
instance, for certain alloys such as the more composition-sensitive
aluminium-silicon-magnesium (Al-Si-Mg) alloy class, techniques such
as aluminium alloy separation by laser-induced breakdown spectroscopy
are promising ways to screen and sort for specific doping elements
such as iron and copper (Cu). As alloys become more diverse, detection
methods more sensitive, and throughput increases, these methods will
become economically competitive. When using such methods, the
associated energy costs must be considered when comparing recycling
to primary production.

Until scrap sorting is perfected, the challenge for the use of secondary
materials in a broader range of products is impurity tolerance, particularly
as increased use of scrap may lead to compositional drift of alloy
streams. This compatibility and potential for drift is particularly important
for aluminium alloys, where the thermodynamics of the remelting
processes dictate the fate of associated alloying and tramp elements,
possibly leading to the formation of undesired intermetallic phases.
Only around 20% of end-of-life scrap is turned into wrought aluminium,
even though wrought products account for two-thirds of all aluminium
in use. The reason for this discrepancy is that many Al-Si alloys, used
for cast products, are particularly tolerant regarding high scrap usage
whereas ductile-sheet-forming variants based on the Al-Mg and Al-Mg-Si
systems are very sensitive to impurities.

Steel recycling is primarily done in an electric arc furnace, but also
about 10-20% of ore-based steelmaking is from scrap (used for cooling
in an oxygen converter). Steel production in an electric arc furnace has
lower total energy consumption, enabling more flexible use of scrap
materials, but there is reduced flexibility regarding which alloys can
be made because the refining reactions inside an electric arc furnace are
challenging. One of the key limitations in steel recycling is that during the
separation process there is incomplete separation of copper-containing
components, an important contaminant in this downstream process.
The copper content in a shredding process can be upwards of 0.25-0.3%
(although, when copper prices rise, there is more incentive to manually
pick this material out of the stream). Tin can also cause downstream
processing problems, particularly in combination with copper.
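
The need to dilute contaminated scrap with primary material follows from a simple mass balance. The Python sketch below assumes linear mixing of a single impurity and returns the largest scrap fraction that keeps the blended melt at or below a tolerance. The scrap copper contents of 0.25% and 0.3% come from the text; the primary-metal copper content (0.02%) and the product tolerance (0.10%) are hypothetical values chosen only for illustration, as is the function name.

```python
def max_scrap_fraction(scrap_conc, primary_conc, limit):
    """Largest mass fraction of scrap such that the blended impurity
    concentration stays at or below `limit`.

    Assumes linear mixing: c = f * scrap_conc + (1 - f) * primary_conc,
    with all concentrations in the same units (here wt%).
    """
    if primary_conc > limit:
        raise ValueError("primary material already exceeds the limit")
    if scrap_conc <= limit:
        return 1.0  # scrap meets the limit without dilution
    return (limit - primary_conc) / (scrap_conc - primary_conc)

PRIMARY_CU_WT_PCT = 0.02   # hypothetical copper content of primary steel
CU_LIMIT_WT_PCT = 0.10     # hypothetical tolerance of the target product

for scrap_cu in (0.25, 0.30):  # shredded-scrap copper contents from the text
    fraction = max_scrap_fraction(scrap_cu, PRIMARY_CU_WT_PCT, CU_LIMIT_WT_PCT)
    print(f"scrap at {scrap_cu:.2f} wt% Cu: up to {fraction:.0%} scrap in the charge")
```

Under these assumed limits only about a third of the charge could come from such scrap, which illustrates why better separation of copper-containing components, or products that tolerate more copper, would raise the usable scrap fraction.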

As mentioned above, the material cycle for nickel is closely linked to 
that of stainless steel, and this manifests in its recovery too. Stainless 
steel can be either ferritic or austenitic. The recycling rate is much higher 
for austenitic stainless steel (nickel-containing) than for ferritic stain- 
less steel. When recycled through a shredder plant, the ferritic stainless 
steel portion will be collected magnetically together with the ordinary 
carbon-steel scrap and will therefore be mixed into carbon steel. The 
stainless-steel scrap mixed into carbon-steel scrap was estimated to have 
reached 32% of postconsumer stainless-steel scrap flows in 2010. Studies
have estimated that 80% of postconsumer nickel scrap is recovered
within the nickel cycle, whereas 20% becomes a constituent of carbon
and copper scrap (and is not recovered as nickel). Therefore, improving
nickel recovery requires both avoiding its dissipation in carbon steels 
and the separation of low-nickel from high-nickel austenitic steels. Both 
measures are important for reducing nickel loss and avoiding nickel 
contamination of recycled carbon steel. 

Given the focus on titanium within the aerospace sector, there is con- 
siderable potential for recovery from scrap generated in the production 
process (Fig. 2; for example 100 tons of titanium alloy scrap is generated 
in making a frame for a 787 aeroplane). Because of the strict specifi- 
cations within the industries in which titanium is used, the long-lived 
nature of those products (translating to a lack of economic incentive 
based on lack of post-use volume), and challenges around oxygen and 
iron impurities, essentially all titanium recycling is post-industrial.
Post-industrial material can be re-introduced into the remelting step 
of the primary route to make titanium ingots. 

The challenge with improved titanium recycling even for industrial 
scrap is oxygen contamination, which can be decreased by remelting 
scrap with virgin titanium, but only up to a limit (particularly if demand
grows). Novel processing technologies are focused primarily on oxygen 
removal technology as well as management of flexible scrap remelt- 
ing. Commercialized processes are focused on electrolytic refining of 
sponge titanium in molten salt and calcium deoxidation. Otherwise, 
less well sorted or more contaminated (for example, iron-contaminated 
titanium generated in the smelting process) titanium would be used 
predominantly within ferro-titanium (Fig. 1; global demand 60,000 
tons per year) or as an additive within other metal streams such as steel,
nickel, copper or aluminium. 

Another fundamental challenge associated with high scrap usage in
production is that the properties (such as strength, toughness or corrosion
resistance) may vary intolerably between two different furnace
charges. Scrap-dependent heat treatment adjustment and the required
blending with primary material could be predicted using through-process
computational materials engineering simulations that must be based on
robust phase diagrams and kinetic databases. The customers primarily
require specific properties rather than compositions, that is, it may be
possible to correct composition-dependent property variations through
adjusted, flexible and batch-specific downstream heat treatments, an
area where data-driven methods might become helpful.


Sustainable alloy design and recycling-friendly materials 

Progress and opportunities in recycling have focused on achieving an 
optimal fit of the collected and sorted scraps to existing alloys. Here we 
approach the task from a different perspective, namely, how the design
of metallic alloys could be changed to render them more recycling- 
oriented upfront. This term refers to the capability of an alloy to be 
made from the highest possible fractions of (low-grade) scraps and 
at the same time to be compatible with other alloys when serving as 
scrap. This means that the elusive goal in optimized materials recovery 
isnot only to understand better the influence of impurity elements on 
properties, but also to build recyclability directly into the design of 
materials. Current structural alloys are not devised for end-of-life but 
rather for one-time use. Research into developing a science of less-pure 


and thus recycling-friendly alloys covers many aspects: small concentra- 
tion thermodynamics and kinetics; impurity trapping at lattice defects; 
compositional existence ranges of phases; size, dispersion and composi- 
tion of harmful intermetallic precipitates; solute-driven cohesion and 
decohesion effects and associated property changes. Optimization 
methods coupled with metallurgical design can suggest alloys whose 
compositional constraints can be modified towards more scrap-tolerant 
ranges while preserving performance”. 

This approach marks a shift in alloy design, which currently aims in 
part at realizing new properties through changes in chemical compo- 
sition. However, scrap compatibility in secondary production can be 
better realized when avoiding compositionally over-designed alloys 
and instead using materials from only a limited composition spectrum, 
where property tuning is achieved through microstructure adjustment. 
The best examples are steels, which provide hundreds of material vari- 
ants with different microstructures and properties, yet all based essen- 
tially on the iron-carbon-manganese-silicon (Fe-C-Mn-Si) system. 

When taking a closer look at the quest for recycling-friendly alloys,
the two approaches (composition tuning and microstructure tuning)
are not so different: while the approach of using only limited chemical
variation holds, repeated processing of scraps leads to accumulation of
contaminants in secondary synthesis. This turns simple alloy systems
such as Al-Mg-Si (used, for instance, in automotive sheet production)
into a multi-component” material containing also iron, manganese,
chromium, titanium, zinc and copper. Some of these impurities can lead
to brittle phases. This means that recycling-oriented alloy design must
study the low-dose corners of multi-component phase diagrams that
may take into account up to twenty elements. A better understanding
of multi-component thermodynamics and kinetics is thus an important
pillar of the design of more impurity-tolerant alloys*®.

A related, but more disruptive, scrap-friendly alloy design approach
consists in the design of crossover alloys, which are sometimes also
referred to as broadband alloys or uni-alloys®. This approach aims to
replace the variety of over-designed alloys by a smaller number of materials
each covering a broader range of properties to serve mass markets.
For aluminium, where 250 specialized alloys are stocked but only 65 are
regularly used, such crossover alloys could combine features of heat-treatable
and non-heat-treatable wrought alloys at broad composition
tolerance and with wide application ranges, establishing a universal alloy
concept. A specific example is the improvement of the strain-hardening
capacity, which is needed in sheet forming. This can be achieved not only by
a higher solute magnesium content, owing to its effect on dislocation
motion, but also by tuning crystallographic texture, reducing grain and
dislocation cell sizes and improving precipitate dispersion.

Similar aspects apply to Al-Mn alloys, used in recycling-intensive
branches such as packaging, which for non-safety-critical products can
tolerate a large variation in composition (the impurity concentration of
some contaminants such as iron, silicon, zinc and magnesium can vary
by factors of up to five among batches; Fig. 3).


Longevity by corrosion protection, lifetime extension and re-use

Enabling longer-lasting products would reduce resource extraction
through lifetime extension and repairability of products or by facilitating
re-use (Fig. 2)°. We note, however, that increasing product lifetime will
not reduce the demand arising from increasing population. Therefore,
for developing regions where the population is growing this strategy
is of limited value. The longevity of alloys is mainly limited by fatigue,
creep, corrosion including hydrogen embrittlement, thermal ageing
and irradiation.

Fatigue is an effect of permanent microplastic deformation when a
material is exposed to cyclic loading. This often occurs together with
thermal and/or corrosive attack owing to the presence of oxygen and
hydrogen, causing a phenomenon known as stress corrosion cracking.
A similar scenario of gradual material decay is caused by creep,
which is a phenomenon where thermal activation enables plastic flow
and microstructure coarsening of parts exposed to high homologous
temperatures.

Corrosion and stress-corrosion cracking are by far the most severe 
phenomena limiting the longevity and integrity of metal products, 
destroying about 3.4% of the global gross domestic product every 
year, a value translating to US$2.5 trillion (ref. *°) (see Fig. 2). Hence, 
any progress in corrosion resistance has large effects on longevity, most 
relevant for carbon steel. 

In this context, loss of material and system failure due to oxidation 
accounts for the vast majority of the economic impact of corrosion and 
is an essential factor in infrastructure costs worldwide. Oxidation of 
metallic structures proceeds mostly through galvanic corrosion, which 
occurs when adjacent microstructural regions or different metals with 
unlike electrochemical potentials are in conductive contact. The elec- 
trochemically more active region then acts as anode and corrodes faster 
than the cathodic reactant. Galvanic corrosion is the prevalent decay 
mechanism when metal structures are in contact with an electrolyte 
such as water with solute ions. 
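
The rule that the electrochemically more active partner becomes the anode can be sketched with textbook standard electrode potentials. This is a first-order illustration only: the potentials are standard values versus the hydrogen electrode, the helper name is hypothetical, and real galvanic behaviour also depends on the electrolyte, passivation and surface condition.

```python
# Textbook standard electrode potentials in volts vs. the standard hydrogen
# electrode (M^n+/M couples at 25 degrees C). Real galvanic series depend on
# the electrolyte and on passivation, so this is only a first-order guide.
STANDARD_POTENTIAL_V = {"Zn": -0.76, "Fe": -0.44, "Cu": 0.34}


def galvanic_couple(metal_a: str, metal_b: str):
    """Return (anode, cathode, driving_voltage) for two metals in contact."""
    # The metal with the lower (more negative) potential is the anode.
    anode, cathode = sorted((metal_a, metal_b), key=STANDARD_POTENTIAL_V.__getitem__)
    voltage = STANDARD_POTENTIAL_V[cathode] - STANDARD_POTENTIAL_V[anode]
    return anode, cathode, voltage


for pair in (("Fe", "Zn"), ("Fe", "Cu")):
    anode, cathode, voltage = galvanic_couple(*pair)
    print(f"{pair}: {anode} corrodes preferentially, {cathode} is protected "
          f"({voltage:.2f} V)")
```

In this simplified picture iron is protected when coupled to zinc but corrodes when coupled to copper, which is the reasoning behind zinc as a sacrificial anode for steel.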

Hydrogen embrittlement is another type of corrosion and poses a 
serious impediment for carbon-free hydrogen-propelled technologies. 
Unlike other corrosion products such as oxides and hydroxides, hydrogen
is hard to detect and several embrittling effects can occur, such as
hydrogen-enhanced plasticity, decohesion, superabundant vacancies, 
hydride formation or nanovoids. The interplay among them makes 
it difficult to identify a clear cause of failure. Also, hydrogen-related 
damage can occur suddenly, causing abrupt catastrophic failure of 
structures. Hydrogen embrittlement can occur in structural alloys*?, 
particularly in iron, aluminium, nickel and titanium alloys with strength 
levels above 650 MPa. 

Alloy lifetime can also be reduced by thermal and radiation effects*,
causing brittle phases, enhanced abundance and mobility of lattice
defects or capillary-driven microstructure coarsening>>”*. The industrial
relevance of this is huge. Many components in the energy industry,
specifically in nuclear reactors, can suffer from these phenomena, making it a
field where safety issues can sometimes override sustainability concerns.

Measures to reduce fatigue and creep damage in alloys use some of the
intrinsic damage-resistance and crack closure mechanisms that metals
offer*”°’. Examples are crack closure induced by plastic deformation,
chemical reactions such as oxidation or athermal phase transformation® ©
which are caused by the stress increase before a crack tip, triggering
stress-driven phase transformation. This is often non-volume
conserving, thus creating compressive stresses that can close crack
tips. Another approach is diffusion-driven pore filling during creep®.
Since most corrosion phenomena are interface-dominated reactions
involving mass transport (mostly of metal and oxygen ions), corrosion
protection (particularly against oxidation) is among the most important
and efficient means of enhancing product longevity.

Corrosion protection methods are as varied as the underlying reactions
and decay phenomena. Methods for mitigating oxidation depend
on the underlying electrochemical reactions and the nature of the resulting
products (see steel in Fig. 2). Countermeasures may rely on shifting
the thermodynamic direction of the oxidation reaction by providing a
sacrificial anode, engineering alloy compositions to favour formation
of a protective oxide or to disfavour formation of detrimental phases,
or directly preventing oxygen from reaching the vulnerable material
via protective coatings. (Technological schemes, such as impressed
current cathodic protection, are also widely applied but are beyond
the scope of this review.)

Steel protection via zinc coatings is the most frequent application of
a sacrificial anode against atmospheric oxygen exposure and accounts
for half of the global zinc production of 13.5 Mt per year. Hence,
environmental considerations apply also to zinc production and recycling
when improving the longevity of steels. Some metals, such as many
aluminium, titanium, nickel and stainless-steel alloys, form dense,
adherent, self-healing oxidation products that resist corrosion intrinsically
by preventing further oxygen intrusion. Even in these materials,
interfacial segregation and second phase formation can contribute
to corrosive damage. Finally, engineering protective coatings is an art in
itself, since an imperfect or permeable coating can actually enhance
the corrosion it is meant to prevent.

Protecting materials against hydrogen is challenging: whereas some
materials such as titanium undergo formation of brittle hydrides, others
such as nickel and high-strength steels experience enhanced local
plasticity and void formation. Measures that may reduce hydrogen
embrittlement are the reduction of regions of high micromechanical
contrast among phases and microstructure components (as hydrogen
tends to enrich in highly stressed regions such as at interfaces), dense
oxide surface layers which reduce hydrogen uptake, and trapping of
hydrogen at semi-coherent internal interfaces and supposedly also at
other defects** ©. In some cases, hydrogen trapping can be a cause of
undesired local softening, such as through the stabilization of superabundant
vacancies or the lowering of the activation barriers for the
double-kink mechanism of dislocation motion”.

When damage becomes visible, repairs can be done by cladding, welding
or grinding for components ranging from bridges” to turbine
blades”)””. Repairs can be conducted even when damage is not visible,
for example, through maintenance treatments. There has been considerable
research on self-healing metals, in which the main goal is that the
material should have autonomous crack closure mechanisms”. Yet
ambient temperature sluggishness of transformation kinetics in metals”
limits the success of a truly autonomous crack closure mechanism
to only a few case studies”. Most other cases require external treatments.
The repairability of metals can be increased by focusing on removing
microstructural changes that lead to embrittlement, instead of focusing
only on micro-crack closure. This would enable prolonged utilization
of damage resistance that is intrinsic but consumable (that is,
it is gradually used up) in alloys exposed to repeated mechanical loads,
which arise from plasticity””* or transformation**”’ micro-mechanisms.
Thermally induced embrittlement effects in duplex stainless steels,
which are due to G-phase formation (a CrNi-silicide), can be removed
by annealing, enabling the part’s re-use®°. Similarly, cut-edge damage
in sheet metal can be reduced by specifically designed cutting treatments™.
Furthermore, such microstructure resetting strategies could
enable further re-manufacturing processes for sheet metal that would
increase re-use.

Approaches to improving the sustainability of structural metals 
are supported by progress in computational methods. Metallurgists 
can now make use of databases of experimental data surrounding 
structure-property relationships, loading-specific precipitation, 
coarsening, phase transformation and even complete-lifetime predic- 
tions*®*36*, Data-driven approaches can use machine learning meth- 
ods*** that can sometimes be computationally more tractable than 
simulation-based approaches that aim to avoid damage-susceptible 
microstructures (to reduce failure)” or to make predictions for when 
to apply repair treatments and how alloy compositions can be rendered
more compatible for recycling. 

Enabling re-use provides considerable opportunities for steel and 
aluminium, given that many applications in building and transportation 
reach end-of-life not because they fail but because they are replaced for 
economic reasons. Barriers to re-use are typically not technical in nature 
but rather economic, such as lack of demand, traceability concerns and
lack of supply chain infrastructure”. These systemic barriers need to be 
addressed to realize re-use potential through government leadership, 
education and information sharing. 


Energy efficiency by lightweighting and harsh operating 
conditions 


Metallurgical improvements can increase the performance and energy
efficiency of industrial systems, products and processes simultaneously,
at reduced waste and greenhouse gas emissions. Here we tackle two
pathways towards increasing ‘in-use’ energy efficiency: lightweighting
(alloy design for weight reduction, Fig. 2) and increased operating
temperature (efficiency through high-temperature materials, Fig. 2).

Fig. 3 | Multi-component alloys for high scrap usage. Alloys made from high
scrap fractions can contain multiple contaminants, which means that they must
be designed initially to be impurity-tolerant. This requires knowledge about
low-concentration, multi-component phase diagram regions. The schematic
shows a phase diagram with typical engineering alloys in the corners, where one
element prevails, and intermetallics in the binary and ternary centres with fixed
stoichiometries and high- and medium-entropy alloys as solid-solutions in the
centres of the multi-component phase diagrams. The presence of scrap-related
contaminants makes this a low-concentration, multi-component phase
diagram. Below is an atom probe tomography dataset taken from a recycled
Al-Mn base alloy used for packaging, revealing a high number of tolerable
impurity elements, each below 0.1%. Such recycled Al-Mn alloys can also serve in
infrastructure applications.

Worldwide, approximately 12% of steel products (about 121 Mt per year)
and about 27% of aluminium products (12 Mt per year) are used in transportation,
incentivizing efforts to reduce weight in automotive” ®, aerospace®* *’
and railway”’ components. In the case of vehicle lightweighting,
the potential is considerable given that 20% of our global energy and process
CO₂ emissions originate from transportation and about 20% of that
could be reduced through lightweighting””°. We note that, historically,
vehicles have become heavier over time with improvements to comfort,
performance and safety, even as our ability to lightweight the vehicle has
improved. Lightweighting has sustainability benefits for other industries
linked with transportation as well, such as the packaging (about 9 Mt per
year of steel, about 6 Mt per year of aluminium) or construction industries
(583 Mt per year of steel, 11 Mt per year of aluminium) (see Fig. 1). Efforts
to reduce materials use through redesign, including alloy development
and particularly microstructure tuning, could remove as much as 30% of
steel and aluminium by mass from transportation uses.
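
The shares quoted in this paragraph can be combined in a short worked calculation. The sketch below only multiplies the figures stated in the text; the variable names are ours, and the 30% value is the upper bound given above rather than an expected outcome.

```python
# Figures as quoted in the paragraph above
TRANSPORT_SHARE_OF_EMISSIONS = 0.20   # share of global energy and process CO2
LIGHTWEIGHTING_SHARE = 0.20           # share of transport emissions addressable
STEEL_IN_TRANSPORT_MT = 121.0         # Mt per year
ALUMINIUM_IN_TRANSPORT_MT = 12.0      # Mt per year
MAX_REMOVABLE_FRACTION = 0.30         # upper bound stated in the text

# Share of global emissions addressable by lightweighting
global_emission_share = TRANSPORT_SHARE_OF_EMISSIONS * LIGHTWEIGHTING_SHARE

# Upper bound on metal removed from transportation uses (Mt per year)
steel_removed_mt = STEEL_IN_TRANSPORT_MT * MAX_REMOVABLE_FRACTION
aluminium_removed_mt = ALUMINIUM_IN_TRANSPORT_MT * MAX_REMOVABLE_FRACTION

print(f"addressable share of global emissions: {global_emission_share:.0%}")
print(f"steel removed (upper bound): {steel_removed_mt:.1f} Mt per year")
print(f"aluminium removed (upper bound): {aluminium_removed_mt:.1f} Mt per year")
```

Taken at face value, the two 20% figures imply that lightweighting addresses roughly 4% of global energy and process CO₂ emissions.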

In each of these industries, weight can be reduced (1) by using less
metal, compensated by higher strength, (2) by using metal of lower density,
or (3) by optimizing component design. (1) Using less metal means
improved strength or elastic modulus properties must be achieved,
while avoiding a decrease in toughness and ductility. There are several
strengthening mechanisms that can be triggered in metals by optimizing
thermo-mechanical processing and/or composition, but many of them
reduce ductility and toughness. Therefore, designing metals that exhibit
simultaneous increases in these properties has been an essential goal
of metallurgical research?"”’'™. The most effective mechanisms for
improving strength and ductility simultaneously have relied on phase
metastability. Transformation-induced plasticity (TRIP) and twinning-induced
plasticity (TWIP) mechanisms have been extensively used for
this purpose, in TRIP-assisted”"”’ or TWIP-assisted’°* ™ steels and
titanium alloys™"” through ‘metastability’ tuning of the stacking fault
energy, achieved mainly by adjusting the carbon and manganese content
and composition partitioning among phases. A more fundamental
challenge concerns the end product of strain-induced martensitic
transformation: martensite. Although its transformation reaction
has beneficial effects, the resulting fresh martensite and its bounding
hetero-interfaces are often sites of damage nucleation”. Similarly, the
elastic modulus is an essential design consideration for vehicle mass
reduction. Several high-stiffness metal matrix composites have been
developed using stiff ceramic precipitates. Examples are Al-TiB₂ and
Fe-TiB₂. Some of these materials provide improved stiffness of up to
10% with comparable formability. Upscaling depends on the capability
to produce such alloys in larger quantities, for example, through in situ
liquid metallurgy”.

(2) For the second approach to lightweighting, one goal is reducing 
the density of steels. When blending Fe with up to 25% manganese 
and up to 1.2% carbon the steel crystallizes into a face-centred-cubic 
structure, tolerating up to 8 atomic per cent aluminium in solid solution 
without formation of brittle phases. These materials are referred to as 
low-density steels and have a tensile strength up to 1.5 GPa at up to 60%
elongation. When adding 20 atomic per cent Al to Fe-Mn-C alloys the 
mass density is reduced by as much as 15%, yet with reduced Young’s 
modulus and precipitation of perovskitic carbides’. 

Another avenue for weight reduction makes use of aluminium-based*"*
and magnesium-based’” alloys. Current research is focused
on achieving a better strength-ductility compromise for sheet applications,
realizing ultrahigh-strength Al-Zn-Mg-Cu”®, as well as weight-reduced
and stiffness-enhanced Al-Li alloys”. Several even lighter
alloys are currently being developed, based on the Mg-Al-Zn"», Mg-Al-Ca"$"”
and Mg-rare-earth’”° systems, with a mass density as low as
about 1.7 g cm⁻³, that is, about 80% reduced density compared to steels.
Extreme weight reduction, but with insufficient corrosion resistance,
was realized in a Mg-Li alloy that approaches a mass density of only
about 1 g cm⁻³ (that is, that of water”).
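
The density comparison can be checked with a one-line ratio. The sketch assumes a reference density of about 7.85 g cm⁻³ for steel, a typical handbook value that is not stated in the text; the alloy densities are those quoted above.

```python
RHO_STEEL = 7.85  # g cm^-3, assumed typical value for carbon steel


def density_reduction(rho_alloy: float, rho_reference: float = RHO_STEEL) -> float:
    """Fractional mass-density reduction of an alloy relative to a reference."""
    return 1.0 - rho_alloy / rho_reference


mg_alloy_reduction = density_reduction(1.7)   # Mg-based alloys, about 1.7 g cm^-3
mg_li_reduction = density_reduction(1.0)      # Mg-Li alloy, about 1 g cm^-3

print(f"Mg-based alloys: {mg_alloy_reduction:.0%} lighter than steel")
print(f"Mg-Li alloy:     {mg_li_reduction:.0%} lighter than steel")
```

With this assumed reference, the magnesium-based alloys come out at roughly 78% lower density, consistent with the 'about 80%' quoted in the text.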

(3) Design improvements can also lead to weight reductions. Here, 
our focus is not on topology optimization’, but on mesoscale optimi- 
zation of the spatial distribution of microstructure features”. In cases 
where processing is feasible, grading of structure or composition can 
enhance properties* ”’, enabling lightweighting. Materials design- 
ers have explored gradients of grain size’*””"”’, twin density”?™, 
component or phase fraction’. These investigations demonstrated 
that the underlying physical micro-mechanisms of deformation can 
be influenced by such grading, leading to improvements in plastic- 
ity’? strengthening’, and damage resistance”, However, 
processes that realize gradients are difficult to scale up (for example, 
accumulative roll bonding”””’, thin film deposition”°). To this end, 
additive manufacturing methods offer potential for the synthesis of 
graded materials, and can be also employed to create functionally 
graded structures “°, although costs and production speed need 
to be improved. 

Another field of interest is the overlap between additive manufac- 
turing, alloy design, architectured materials, computational materials 
mechanics and bionics. Bionic design enables computer-generated 
topology-optimized lean geometries of parts at reduced mass and 
improved structural stiffness’”"”’ (Fig. 4). 

Fig. 4 | Synergies between structural optimization, additive manufacturing,
architectured materials and bionics. Example of a three-dimensional
cantilever beam design using these synergies. The upper row shows an
architectured lattice structure with different cellular densities (CD) ranging
from 10% to 75%. The middle row shows different combinations of volume
fraction (VF) and CD for a part obtained by topology optimization. The four
combinations all yield the same reduction in weight (94%) relative to the massive
(un-architectured, not porous) cantilever beam with 100% VF and 100% CD. The
image was compiled using results from ref. .

Another domain where structural alloys offer improved efficiency
is the use of higher operational temperatures in energy conversion”.
The efficiency of such heat engines is bounded by the Carnot limit, and
hence higher service temperatures yield better efficiency. As global
electricity consumption amounts to 23 PW h (10¹⁵ W h) and electricity is
the fastest-growing source of energy demand, most of it provided by
turbines, higher conversion efficiency has huge potential for saving
energy. This should stimulate research on nickel-based and cobalt-based
superalloys, Ti-aluminides and Mo-Si-B alloys (Fig. 2).
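
The link between service temperature and efficiency follows from the Carnot limit, η = 1 − T_cold/T_hot, with temperatures in kelvin. The sketch below evaluates it for two hypothetical hot-side temperatures; the values are illustrative only, and real turbines operate well below this ideal bound.

```python
def carnot_efficiency(t_hot_k: float, t_cold_k: float) -> float:
    """Upper bound on heat-engine efficiency between two reservoirs (kelvin)."""
    if not (t_hot_k > t_cold_k > 0):
        raise ValueError("require t_hot_k > t_cold_k > 0 (absolute temperatures)")
    return 1.0 - t_cold_k / t_hot_k


# Hypothetical temperatures: a 100 K increase on the hot side
T_COLD_K = 300.0
eta_base = carnot_efficiency(1000.0, T_COLD_K)
eta_hotter = carnot_efficiency(1100.0, T_COLD_K)

print(f"Carnot limit at 1000 K: {eta_base:.3f}")
print(f"Carnot limit at 1100 K: {eta_hotter:.3f}")
print(f"gain in the ideal limit: {eta_hotter - eta_base:.3f}")
```

Even in this idealized picture the gain per 100 K is a few percentage points, which matters at the scale of global electricity generation.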


Outlook on enhanced sustainability of structural metals 
Structural metallic alloys have served as key enablers of human pro- 
gress, wealth and wellbeing over millennia. Now, in the acceleration 
phase of the Anthropocene, their great advantages in terms of avail- 
ability, mass producibility and low price have also become an environ- 
mental burden. Considering the huge quantities produced (1.7 billion 
tons of steel and 94 million tons of aluminium per year), the task of 
making structural alloys and their products sustainable is an enormous 
challenge. Metals-related sustainability solutions now require a holistic 
view of production and manufacturing, and a thorough metallurgical 
understanding of structure-property relations, product function and
longevity, resource efficiency, pollution, market dynamics and societal 
impact”. These aspects are closely connected, with some of them being 
of thermodynamic nature (and therefore quantifiable), whereas others 
are harder to assess, such as customer response to green branding or 
market development"’. We therefore recommend the use of market- 
informed life-cycle assessment calculations for adequate risk and effect 
quantification before political and industrial decision-making, in order
to reveal the true long-term efficiency gains of the various possible 
sustainability measures. 

To improve the sustainability of metals production and use, 
several recommendations with both high leverage and high technology- 
readiness follow from this overview. The greatest potential is attributed 
to the use of fossil-free and fossil-reduced energy sources in primary 
and secondary extraction and manufacturing as well as production 
methods that allow the dynamic use of green electricity. Also of great
importance are improved corrosion resistance; materials-efficient
manufacturing; improved product-to-product recycling; automated
post-consumer scrap sorting; recycling-oriented alloy design and the
development of multi-purpose crossover alloys.

A critical dimension in realizing many of the opportunities mentioned
here that we have left untouched in our discussion is the importance of 
government action and economics. In the absence of regulations, the 
only driving force for emission reductions would be those strategies 
that can demonstrate economic benefit. The role of policy would be 
to leverage quantitative assessments to incentivize the most effective 
(in terms of sustainability) strategies”. Several current metal-industry- 
related policies are inadequate, focusing on technology-specific 
deployment substitutes, or are not sufficiently transformative, there- 
fore leaving potential for lock-in risk’. Measures that deserve immedi- 
ate attention include pollution and emission controls across national 
borders, demand stimuli such as procurement mandates or recycling 
incentives that consider both sources and sinks for recovered metal, and
supply measures that are technology-neutral but set benchmarks and 
standards for clean manufacturing while supporting scale-up needs, 
all coupled to supporting infrastructure and market analysis to enable
the most economically viable strategies compatible with these
sustainability goals¹⁵².

Systematically implementing the measures discussed will break with 
almost all traditions of our current industrial practice since the begin- 
ning of the first industrial revolution around 1800. Striving towards 
sustainability will become the next industrial revolution.

Rapid growth in the market for electric vehicles is imperative, to meet global targets
for reducing greenhouse gas emissions, to improve air quality in urban centres and to 
meet the needs of consumers, with whom electric vehicles are increasingly popular. 
However, growing numbers of electric vehicles present a serious waste-management 
challenge for recyclers at end-of-life. Nevertheless, spent batteries may also present an 
opportunity as manufacturers require access to strategic elements and critical 
materials for key components in electric-vehicle manufacture: recycled lithium-ion 
batteries from electric vehicles could provide a valuable secondary source of materials. 
Here we outline and evaluate the current range of approaches to electric-vehicle 
lithium-ion battery recycling and re-use, and highlight areas for future progress.

The electric-vehicle revolution, driven by the imperatives to decarbonize
personal transportation in order to meet global targets for reductions
in greenhouse gas emissions and improve air quality in urban centres, is
set to change the automotive industry radically. In 2017, sales of electric
vehicles exceeded one million cars per year worldwide for the first time’.
Making conservative assumptions of an average battery pack weight
of 250 kg and volume of half a cubic metre, the resultant pack wastes
would comprise around 250,000 tonnes and half a million cubic metres
of unprocessed pack waste, when these vehicles reach the end of their
lives. Although re-use and current recycling processes can divert some
of these wastes from landfill, the cumulative burden of electric-vehicle
waste is substantial given the growth trajectory of the electric-vehicle
market. This waste presents a number of serious challenges of scale: in
terms of storing batteries before repurposing or final disposal, in the
manual testing and dismantling processes required for either, and in
the chemical separation processes that recycling entails.
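
The pack-waste estimate above follows directly from the stated assumptions, as the short calculation below shows. It uses only the figures given in the text (one million vehicles, 250 kg and 0.5 m³ per pack); the variable names are ours.

```python
# Assumptions stated in the text for the 2017 sales cohort
VEHICLES_SOLD = 1_000_000
PACK_MASS_KG = 250.0
PACK_VOLUME_M3 = 0.5

waste_mass_tonnes = VEHICLES_SOLD * PACK_MASS_KG / 1000.0   # kg -> tonnes
waste_volume_m3 = VEHICLES_SOLD * PACK_VOLUME_M3

print(f"pack waste mass:   {waste_mass_tonnes:,.0f} tonnes")
print(f"pack waste volume: {waste_volume_m3:,.0f} cubic metres")
```

Because both quantities scale linearly with sales, annual sales in the tens of millions would multiply these figures accordingly.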

Given that the environmental footprint of manufacturing electric
vehicles is heavily affected by the extraction of raw materials and production
of lithium-ion batteries, the resulting waste streams will inevitably
place different demands on end-of-life dismantling and recycling
systems. In the waste management hierarchy, re-use is considered preferable
to recycling (Fig. 1). Because considerable value is embedded in
manufactured lithium-ion batteries (LIBs), it has been suggested that
their use should be cascaded through a hierarchy of applications to
optimize material use and life-cycle impacts’. Markets for energy storage
are under development as energy regulators in various locations transition
to cleaner energy sources. Energy storage is particularly sought-after
in areas where weak grids require reinforcement, where high
penetration of renewables requires supply to be balanced with demand,
where there is an opportunity for trading energy with the grid and in off-grid
applications. Second-use battery projects have started to develop
in locations where there is regulatory and market alignment. However,
large concentrations of waste (be it for refurbishment, re-manufacture,
dismantling or final disposal) can create substantial challenges. A fire
in stockpiled tyres in Powys, Wales, for example, smouldered for fifteen
years from 1989 to 2004. Since the electrode materials in LIBs are far
more reactive than tyre rubber’, without a proactive and economically
sound waste-management strategy for LIBs there are potentially greater
dangers associated with stockpiling of end-of-life LIBs. Already the
number of fires being reported in metal-recovery facilities is increasing*,
owing to the illicit or accidental concealment of (consumer) LIBs
in the guise of, for example, lead-acid batteries. Among examples of
recent major fires are those that took place in metal-recovery facilities
in Shoreway, San Carlos, USA, in September 2016°, Guernsey in August
2018 and Tacoma, Washington, USA, in September 2018.

Waste may also represent a valuable resource. Elements and materials
contained in electric-vehicle batteries are not available in many nations
and access to resources is crucial in ensuring a stable supply chain. In the
future, electric vehicles may prove to be a valuable secondary resource
for critical materials, and it has been argued that high-cobalt-content
batteries should be recycled immediately to bolster cobalt supplies®.
If tens of millions of electric vehicles are to be produced annually, careful
husbandry of the resources consumed by electric-vehicle battery
manufacturing will surely be essential to ensure the sustainability of
the automotive industry of the future, as will a material- and energy-efficient
3R system (reduce, re-use, recycle). Here we give an overview
of the current state of the art and identify some of the important issues
relating to the end-of-life management of electric-vehicle LIBs.

Fig. 1 | The waste management hierarchy and range of recycling options. The
waste management hierarchy is a concept that was developed from the Council
Directive 75/442/EEC of 15 July 1975 (https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A31975L0442)
on waste by the Dutch politician Ad
Lansink, in 1979, who presented to the Dutch parliament a simple schematic
representation that has been termed ‘Lansink’s Ladder’, ranking waste
management options from the most to least environmentally desirable options.
Here, that hierarchy is expanded to consider the range of battery recycling
technologies. ‘Prevention’ means that LIBs are designed to use less-critical


Social and environmental impacts of LIBs 


If we consider the two main modes of primary production, it takes 
250 tons of the mineral ore spodumene’® when mined, or 750 tons of 
mineral-rich brine’* to produce one ton of lithium. The processing of 
large amounts of raw materials can result in considerable environmental 
impacts’. Production from brine, for example, entails drilling a hole in
the salt flat, and pumping of the mineral-rich solution to the surface. 
However, this mining activity depletes water tables. In Chile’s Salar de 
Atacama, a major centre of lithium production, 65% of the region’s water 
is consumed by mining activities’. This affects farmers in the region who 
must then import water from other regions. The demands on water 
from the processing of lithium produced in this way are substantial, 
with a ton of lithium requiring 1,900 tons of water to extract, which is 
consumed by evaporation’. 

By contrast, secondary production would require only 28 tons of 
used LIBs (around 256 used electric-vehicle LIBs) to produce one ton of 
lithium. The net impact of LIB production can be greatly reduced if more 
materials can be recovered from end-of-life LIBs, in as close to usable 
form as possible. 
However, in the rapid-growth phase of the electric-vehicle market, 
recycling alone cannot come close to replenishing mineral supplies. 
LIBs are anticipated to last 15-20 years based on calendar aging 
(the aging due to time since manufacture) predictions, three times 
longer than lead-acid batteries. Initial concerns regarding 
resource constraints for LIB production scale-up focused on lithium; 
however, in the near term, reserves of lithium are unlikely to present a 
constraint. 
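The tonnages quoted above can be compared with a few lines of arithmetic. The sketch below is illustrative only: it uses the figures given in the text, while the function names, the assumption that "ton" means 1,000 kg, and the derived per-pack mass are ours rather than values taken from the cited sources.

```python
# Feedstock needed to produce one ton of lithium, as quoted in the text.
FEEDSTOCK_TONS_PER_TON_LI = {
    "spodumene ore": 250.0,
    "mineral-rich brine": 750.0,
    "used LIBs": 28.0,
}
PACKS_PER_TON_LI = 256   # used electric-vehicle LIBs per ton of lithium (text)
KG_PER_TON = 1000.0      # assumption: metric tons


def feedstock_ratio(primary: str, secondary: str = "used LIBs") -> float:
    """Mass of primary feedstock handled per unit mass of secondary feedstock."""
    return FEEDSTOCK_TONS_PER_TON_LI[primary] / FEEDSTOCK_TONS_PER_TON_LI[secondary]


def implied_pack_mass_kg() -> float:
    """Average pack mass implied by 28 tons spread over 256 packs (derived, not sourced)."""
    return FEEDSTOCK_TONS_PER_TON_LI["used LIBs"] * KG_PER_TON / PACKS_PER_TON_LI


if __name__ == "__main__":
    for route in ("spodumene ore", "mineral-rich brine"):
        print(f"{route}: {feedstock_ratio(route):.1f} times the mass of used LIBs")
    print(f"implied mass per pack: {implied_pack_mass_kg():.0f} kg")
```

On these figures, primary production handles roughly one order of magnitude more feedstock per ton of lithium than secondary production does.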

Of greater immediate concern are cobalt reserves, which are geographically 
concentrated (mainly in the politically unstable Democratic 
Republic of the Congo). Cobalt has experienced wild short-term price 
fluctuations, and its extraction raises multifarious social, ethical and 
environmental concerns, including artisanal mines employing 
child labour. In addition to the environmental imperative for recycling, 
there are clearly serious ethical concerns with the materials supply 
chain, and these social burdens are borne by some of the world’s most 
vulnerable people. Given the global nature of the industry, addressing them will 
require international coordination to support a concerted push towards 
recycling LIBs and a circular economy in materials. 
[Fig. 1 graphic: range of recycling technologies, from present battery recycling (shredding, pyrometallurgy) to advanced battery recycling (automated disassembly), compared by ‘mixing’ of materials streams, amount of materials recovered and value of materials recovered.] 

materials (high economic importance, but at risk of short supply) and that 
electric vehicles should be lighter and have smaller batteries. ‘Re-use’ means 
that electric-vehicle batteries should have a second use. ‘Recycling’ means that 
batteries should be recycled, recovering as much material as possible and 
preserving any structural value and quality (for example, preventing 
contamination). ‘Recovery’ means using some battery materials as an energy 
source for processes, for example as fuel for pyrometallurgy. Finally, 
‘disposal’ means that no value is recovered and the waste goes to landfill. 


Battery assessment and disassembly 


The waste-management hierarchy considers re-use to be preferable to 
recycling (Fig. 1). As considerable value is embedded in manufactured 
LIBs, it has been suggested that their use should be cascaded through 
a hierarchy of applications to optimize material use and life-cycle 
impacts. Energy stored over energy invested (ESOI), the ratio of 
the electrical energy that a battery will store over its useful life to the 
energy that must be invested in manufacturing it, is a metric used to 
compare the efficacy of different energy-storage technologies. Clearly, 
ESOI figures will improve if end-of-life electric-vehicle batteries can be 
used in second-use applications for which the battery performance is 
less critical. 
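To make the effect of a second use on ESOI concrete, a minimal sketch follows. ESOI is taken as lifetime energy stored divided by energy invested; every number in the example (pack size, cycle counts, depth of discharge, manufacturing and refurbishment energy) is a hypothetical placeholder chosen for illustration, not a value from the literature.

```python
def lifetime_energy_stored_kwh(capacity_kwh: float, cycles: int,
                               depth_of_discharge: float = 0.8) -> float:
    """Energy delivered over a service period, assuming a constant usable capacity."""
    return capacity_kwh * cycles * depth_of_discharge


def esoi(energy_stored_kwh: float, energy_invested_kwh: float) -> float:
    """Energy stored over energy invested (dimensionless)."""
    if energy_invested_kwh <= 0:
        raise ValueError("energy invested must be positive")
    return energy_stored_kwh / energy_invested_kwh


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    manufacturing_kwh = 24_000.0   # energy invested in making the pack
    refurbishment_kwh = 1_000.0    # extra energy invested to prepare a second use
    first_life = lifetime_energy_stored_kwh(60.0, 1500)    # in-vehicle service
    second_life = lifetime_energy_stored_kwh(45.0, 1000)   # stationary storage at reduced capacity

    print(f"ESOI, vehicle use only: {esoi(first_life, manufacturing_kwh):.2f}")
    print(f"ESOI, with second use:  "
          f"{esoi(first_life + second_life, manufacturing_kwh + refurbishment_kwh):.2f}")
```

The second use raises ESOI whenever the extra energy stored is large relative to the extra energy invested in refurbishment, which is the point made in the text.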

Profitable second-use applications also provide a potential value 
stream that can offset the eventual cost of recycling, and already a 
healthy market is developing in used electric-vehicle batteries for energy 
storage in certain localities, with demand potentially outstripping supply. 
For the moment, the economics of the decision whether to recycle 
or re-use are set firmly in favour of re-use. The main factors are (1) the 
refurbishment cost of putting the battery into a second-use application 
and (2) any credit that would accrue as the result of recycling the battery 
instead; if the second-use price were to fall below the sum of the 
refurbishment cost and the recycling credit, then recycling would be 
the economically favoured option. In time, it is anticipated that the 
supply of used electric-vehicle batteries will far exceed the quantity that 
the second-use market can absorb. It must be remembered, therefore, 
that, if disposal to landfill is to be avoided, recycling must be the ultimate 
fate of all LIBs, even if they first have a second use. 
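The decision rule described above can be written down directly. This is a sketch of that single comparison only; the function and parameter names are ours, the prices in the usage example are invented, and a real decision would include further factors (transport, testing, liability) that the text does not quantify.

```python
def favoured_option(second_use_price: float,
                    refurbishment_cost: float,
                    recycling_credit: float) -> str:
    """Apply the rule from the text: recycling is favoured when the second-use
    price falls below refurbishment cost plus recycling credit.

    All amounts are per battery, in the same currency. A negative
    recycling_credit represents a net cost of recycling.
    """
    threshold = refurbishment_cost + recycling_credit
    return "recycle" if second_use_price < threshold else "re-use"


if __name__ == "__main__":
    # Invented prices, for illustration only.
    print(favoured_option(second_use_price=3000, refurbishment_cost=1200, recycling_credit=500))
    print(favoured_option(second_use_price=1500, refurbishment_cost=1200, recycling_credit=500))
```

Because recycling currently tends to carry a cost rather than a credit, the threshold is low and re-use is favoured; a growing supply of used batteries would push the second-use price down towards that threshold.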

Given that stockpiling of waste batteries is potentially unsafe and 
environmentally undesirable, if direct re-use of an LIB module is not 
possible, it must be repaired or recycled. End-of-life LIB recycling could 
provide important economic benefits, avoiding the need for new mineral 
extraction and providing resilience against vulnerable links 
and supply risks in the LIB supply chain. For most remanufacture and 
recycling processes, battery packs must be disassembled to module 
level at least. However, the hazards associated with battery disassembly 
are also numerous. Disassembly of battery packs from automotive 
applications requires high-voltage training and insulated tools 
to prevent electrocution of operators or short-circuiting of the pack. 
Short-circuiting results in rapid discharge, which may lead to heating 
and thermal runaway. Thermal runaway may result in the generation of 
particularly noxious byproducts, including HF gas, which along with 
other product gases may become trapped and ultimately result in cells 
exploding. The cells also present a chemical hazard owing to the flammable 
electrolyte, toxic and carcinogenic electrolyte additives, and the 
potentially toxic or carcinogenic electrode materials. 


Diagnostics of battery pack, modules and cells 

‘State of health’ is the degree to which a battery meets its initial design 
specifications. Over time, as the battery degrades, its performance departs 
from its initial condition. The units are percentage points, with 100% 
indicating a state of health that is identical to that of a new battery meeting 
its design specification. (Some new batteries may leave the factory 
deviating from design specifications, and having less than 100% state of 
health.) The ‘state of charge’ is the degree to which a battery is charged 
or discharged. Again, the units are percentage points, with 0% indicating 
empty and 100% indicating full. 
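As a rough illustration of the two quantities, the sketch below expresses both as percentages. It assumes that state of health can be approximated by the ratio of measured capacity to design capacity, which is one common convention but only one aspect of meeting a design specification; the function names and example values are ours.

```python
def state_of_health(measured_capacity_kwh: float, design_capacity_kwh: float) -> float:
    """State of health in percent, approximated here by capacity retention (assumption)."""
    if design_capacity_kwh <= 0:
        raise ValueError("design capacity must be positive")
    return 100.0 * measured_capacity_kwh / design_capacity_kwh


def state_of_charge(stored_energy_kwh: float, measured_capacity_kwh: float) -> float:
    """State of charge in percent: 0 is empty, 100 is full at the present capacity."""
    if measured_capacity_kwh <= 0:
        raise ValueError("measured capacity must be positive")
    return 100.0 * stored_energy_kwh / measured_capacity_kwh


if __name__ == "__main__":
    # Example: a pack designed for 60 kWh that now holds 45 kWh when full
    # and currently contains 22.5 kWh.
    print(f"state of health: {state_of_health(45.0, 60.0):.0f}%")
    print(f"state of charge: {state_of_charge(22.5, 45.0):.0f}%")
```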

Battery repurposing (the re-use of packs, modules and cells in other 
applications such as charging stations and stationary energy storage) 
requires accurate assessment of both the state of health, to categorize 
whether batteries are best suited for re-use (and if so, for which applications), 
remanufacture or recycling, and the state of charge, for safety 
reasons in some recycling processes. For high-throughput triage and 
gateway testing of batteries at scale, the optimal approach involves 
in situ techniques for monitoring cells in service. These enable advance 
warning of possible cell replacement, and module or pack reconditioning, 
rather than complete repurposing at a low state of health 
owing to a few failing cells. 

Electrochemical impedance spectroscopy can give information on 
the state of health of cells, modules and, potentially, full packs, and 
also an indication of aging mechanisms such as lithium plating. Such 
measurements have the potential to inform a decision matrix for re-use 
or disassembly and processing and, importantly, to identify potential 
hazards that would have further consequences for downstream processing. 
Electrochemical impedance spectroscopy has been researched for 
gateway testing in primary production, for example, in a large battery 
production plant in the UK. A number of electric-vehicle manufacturers 
plan to use similar technologies to manage and maintain electric-vehicle 
battery packs through the identification and replacement of 
failing modules in the field. Substantial advantages in cost, safety and 
throughput time are anticipated if this process can be mostly or fully 
automated. In future, more advanced diagnostic functionality will 
be embedded in battery management systems, providing data that can 
be interrogated at end-of-life. 


Challenges of pack and module disassembly 
Different vehicle manufacturers have adopted different approaches 
for powering their vehicles, and electric vehicles on the market pos- 
sess a wide variety of different physical configurations, cell types 
and cell chemistries. This presents a challenge for battery recycling. 
Figure 2 details three different types of battery cell design, and their 
respective packs from electric vehicles in the marketplace from model 
year 2014. It can be seen that the three vehicles possess very differ- 
ent physical configurations, requiring different approaches for dis- 
assembly, particularly regarding automation. It can be seen in Fig. 2 
that at the different scales of disassembly, the format and relative size 
of the different components differ, presenting challenges for auto- 
mation. The differing form factors and capacities may also restrict 
applications for re-use. And finally, Fig. 2 illustrates that manufactur- 
ers employ varying cell chemistries (see Fig. 3), which will necessitate 
different approaches to materials reclamation and strongly affect the 
overall economics of recycling. Whereas the prismatic and pouch cells 
have planar electrodes, the cylindrical cells are tightly coiled, presenting 
additional challenges to separating the electrodes for direct recycling 
processes. 

For repurposing and second-use applications, automotive battery 
packs are currently dismantled by hand for either the second use of 
the modules or for recycling. The weights and high voltages of traction 
batteries mean that qualified employees and specialized tools are 
required for such dismantling. This is a challenge for an industry in 
transition with a shortage of skills. An Institute of the Motor Industry 
survey found only 1,000 trained technicians in the UK capable of servicing 
electric vehicles, with another 1,000 in training. Given there are 
170,000 motor technicians in the UK, this represents less than 2% of the 
workforce. There is concern that untrained mechanics may risk their 
lives repairing electric vehicles, and these concerns logically extend 
to those handling vehicles at the end-of-life. Additionally, it has been 
suggested that manual dismantling, in countries with high labour 
costs, is uneconomic with respect to revenues from extracted materials 
or components. Vehicle design has to strike compromises between 
crash safety, centre of gravity and space optimization, which must be 
balanced against serviceability. These conflicting design objectives 
often result in designs that are not optimized for recyclability, and that 
can be time-consuming to disassemble manually. 


Automating battery disassembly 

Robotic battery disassembly could eliminate the risk of harm to human 
workers, and increased automation would reduce cost, potentially making 
recycling economically viable. This is being piloted in a number 
of current research projects. Importantly, automation could also 
improve the mechanical separation of materials and components, 
enhancing the purity of segregated materials and making downstream 
separation and recycling processes more efficient. The automation of 
the dismantling of automotive batteries, however, presents major challenges. 
This is because robotics and automation in the manufacturing 
sector rely on highly structured environments, in which robots make 
pre-programmed repetitive actions with respect to exactly known objects 
in fixed positions. In contrast, the development of robotic systems that 
can generalize to a variety of objects, and handle uncertainty, remains 
a major challenge at the frontier of artificial intelligence research. It is 
important to consider the complexity of vehicle battery disassembly 
from this perspective. 

At present there is no standardization of design for battery packs, 
modules or cells within the automotive sector, and it is unlikely that 
this will happen in the near future. Other battery-reliant products, such 
as mobile phones, have seen an exponential proliferation of different 
sizes, shapes and types of battery over the past two decades. At present, 
much of the factory assembly of these batteries is done by human work- 
ers and remains unautomated. Their disassembly and waste-handling 
typically involve even less structured environments, with much greater 
uncertainties, than a manufacturing assembly line. 

Nevertheless, some progress has been made towards automated 
sorting of consumer batteries. The Optisort system uses computer 
vision algorithms to recognize the labels on batteries, and then pneumatic 
actuators to segregate batteries into different bins according to 
their type of chemistry. However, Optisort is currently limited to AA and 
AAA batteries, and a large amount of pre-sorting by hand is needed to 
separate these from mixed batches of waste batteries, prior to entering 
the Optisort machine. 
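The sorting logic that follows label recognition can be sketched in a few lines. This is not the Optisort implementation: the computer-vision step is replaced here by label text that is assumed to have been read already, and the keyword table, bin names and sample labels are all hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword table; the first matching keyword decides the chemistry.
LABEL_KEYWORDS = [
    ("alkaline", "alkaline"),
    ("ni-mh", "nickel-metal hydride"),
    ("nimh", "nickel-metal hydride"),
    ("ni-cd", "nickel-cadmium"),
    ("li-ion", "lithium-ion"),
    ("lithium", "lithium"),
]
MANUAL_BIN = "manual sort"  # fallback for labels that cannot be classified


def classify_label(label_text: str) -> str:
    """Map recognized label text to a chemistry bin, or to the manual-sort bin."""
    text = label_text.lower()
    for keyword, chemistry in LABEL_KEYWORDS:
        if keyword in text:
            return chemistry
    return MANUAL_BIN


def sort_into_bins(labels):
    """Group labels by the bin that each would be directed to."""
    bins = defaultdict(list)
    for label in labels:
        bins[classify_label(label)].append(label)
    return dict(bins)


if __name__ == "__main__":
    batch = ["AA ALKALINE 1.5V", "AAA Ni-MH 1.2V 800mAh", "AA (label worn off)"]
    for bin_name, members in sort_into_bins(batch).items():
        print(bin_name, "<-", members)
```

The fallback bin reflects the limitation noted in the text: anything the recognizer cannot classify still needs handling by hand.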

The Society of Automotive Engineers and the Battery Association 
of Japan have both recommended labelling standards for electric-vehicle 
batteries. Recent algorithms from computer vision research 
have some capability to recognize objects and materials on the basis 
of features such as size, shape, colour and texture. However, it could be 
advantageous for recycling if manufacturers were to include labels, 
QR codes, RFID tags or other machine-readable features on key battery 
components and sub-structures, as some manufacturers already do. 
Where these provide a reference to an external data source, their utility 
in aiding the recycling process will depend on the accessibility and 
format of that data. If proprietary and private, such data are of limited 
use, but there may be initiatives to move towards standardization and 
open data formats. A number of companies are considering blockchain 
Fig. 2 | Examples of three different battery packs and modules (cylindrical, 
prismatic and pouch cells) in use in current electric cars. The three designs 
examined are from model year 2014; this is based on the availability of 
information from vehicle teardowns, and also because older vehicles are more 
likely to be closer to end-of-life than today’s new cars. The breakdowns include 
material content in a cell, layout and content of the module and pack and the 
proportion of critical elements (high economic importance, but at risk of short 
supply) and strategic materials (either high economic importance or risk of 
short supply) used. The Nissan pouch cells from Automotive Energy Supply 
Corporation (AESC) exhibit a mixed cathode chemistry with substantial 
manganese content and relatively low levels of cobalt. The Tesla cylindrical 
18650 cells from Panasonic and the BMW prismatic cells from Samsung SDI both 


technologies to provide whole-life-cycle tracking of battery materials, 
including information and transparency on provenance, ethical supply 
chains, battery health and previous use. China has signalled its 
intention to track battery materials. 

Automated disassembly of electrical goods has also been implemented 
to some extent in other sectors. For example, Apple has implemented 
an automated disassembly line for the iPhone 6 that can handle 
1.2 million phones per year. This line has 22 stations linked on a conveyor 
system and can take the iPhone apart in 11 seconds. However, this system 
can only deal with an iPhone 6 model. Intact phones, of this exact model, 
must be positioned at the start of the disassembly line, which then uses 
pre-programmed motions of 29 robots in 21 different cells to dismantle 
the phone into 8 discrete parts. The LIB is removed by heating the glue 
that holds the battery in place. Owing to the potential fire hazard, 
contain high cobalt levels. Each cell has particular recycling challenges. 
Cylindrical cells are often bonded into a module using epoxy resin (difficult to 
remove or recycle); fuses at each end may be blown, making cell discharge 
challenging; and the cell geometry can be difficult to dismantle for direct 
recycling. Prismatic cells require ‘can opening’ (requiring special tools) to 
remove the contents. These large cells are under considerably more pressure 
than are the pouch or cylindrical cells, and can therefore be hazardous to open if 
the contents have degassed. The high manganese content of the Nissan pouch 
cells makes pyrometallurgical recycling less cost-effective, because manganese 
is cheap, but these cells are the least problematic to open and physically separate 
for direct recycling. 


this must take place inside a thermal event protection system, while 
monitoring the battery using a thermal imaging system. 
Unfortunately, 1.2 million phones per year is a drop in the ocean, 
and the Apple disassembly line has been created using conventional 
industrial automation methods, making it inflexible and incapable of 
keeping up adaptively with new models and varieties of phones. But 
building a flexible and adaptable robot disassembly line need not be 
prohibitively expensive. Robot arms are cheap hardware: they cost only 
several thousand to several tens of thousands of dollars, their costs have 
been steadily decreasing, they can work all the time, and they have very 
long service lifetimes. The challenge is to create control algorithms and 
software that can make this hardware behave flexibly and intelligently to 
handle hugely complex disassembly problems. If those artificial intelligence 
challenges can 
be solved, then the capital investment required to respond to new and 
Fig. 3 | LIB cathode chemistries. The term LIB covers a range of different 
battery chemistries, each with different performance attributes. The basic 
concept of a LIB is that lithium can intercalate into and out of an open structure, 
which consists of either ‘layers’ or ‘tunnels’. Generally the anode is graphite but 
the cathode material may have different chemistries and structures, which 


changing models could be kept remarkably low (mainly software 
updates would be needed). Making robots behave intelligently will 
rely heavily on sensors to enable advanced robotic perception, espe- 
cially computer vision using three-dimensional RGB-D imaging devices, 
combined with bespoke sensors from materials and battery experts. The 
robots will also require tactile and force-sensing capabilities to handle 
the complex dynamics problems of forceful interactions between the 
robots and the materials being disassembled. 

Owing to the complexity of automotive battery packs, the possibility 
of collaborative human-robot co-working using a new generation 
of force-sensitive ‘co-bot’ robot arms has been suggested. Unlike 
conventional industrial robots, these co-bots can safely share a workspace 
with humans, and Wegener suggests that the robot could be 
taught tasks such as unscrewing bolts, while the human handles cognitively 
more complex tasks. However, this approach does not protect 
the human worker from battery hazards, and even the task of locating 
a bolt, moving a tool to engage with it, unscrewing and removing it 
represents a cutting-edge research challenge in robotics and machine 
vision. Using current industrial robotics methods, the problem only 
becomes attemptable (but still difficult) provided that the position 
of the bolt head is always exactly fixed, in a known pose relative to the 
robot, with very high precision. 

State-of-the-art robotics, computer vision and artificial-intelligence 
capabilities for handling diverse waste materials do exist, and these 
systems have demonstrated sufficient robustness and reliability to gain 
acceptance by the UK nuclear industry, for example, in the deployment 
of artificial-intelligence-controlled, machine-vision-guided robotic 
manipulation for cutting of contaminated waste material in radioactive 
environments. These technologies are now being adapted to the 
demanding problem of robotic battery disassembly. At different scales 
of disassembly (pack removal, pack disassembly, module removal and 
cell separation) different challenges and barriers to automation exist. 
Some of these are set out in Fig. 4. Computer-vision algorithms are being 


result in different performance attributes and there are trade-offs and 
compromises with each technology. The cathode chemistries of LIBs have a 
large impact on the performance of LIBs, and these chemistries have evolved 
and improved. Fig. 3 presents a summary of the different LIB cathode 
chemistries. 


developed that can identify diverse waste materials and objects, reliably 
track objects in complex, cluttered scenes, and dynamically guide 
the actions of robot arms. Dismantling requires forceful interaction 
between robots and objects, engendering complex dynamics and control 
problems, such as simultaneous force and motion control, which 
is needed for robotic cutting or unscrewing. Dismantled materials must 
be grasped and manipulated, including fragmented or deformable 
materials, which pose challenges both to vision systems and autonomous 
grasp planners. Adjigble et al. have recently demonstrated 
state-of-the-art performance in autonomous, vision-guided robotic 
grasping of arbitrary objects from random, cluttered heaps. These 
advances in computer vision, artificial intelligence and robotics fundamentals 
offer exceptionally promising tools with which to approach the 
extremely difficult open research challenge of automated disassembly 
of electric-vehicle batteries. 


Stabilization and passivation of end-of-life batteries 

Once LIBs have been designated for recycling, the three main processes 
involved consist of stabilization, opening and separation, which may 
be carried out separately or together. Stabilization of the LIB can be 
achieved through brine or Ohmic discharge. In-process stabilization 
during opening, however, is the current route preferred in industry, as 
it minimizes costs. This consists of shredding or crushing the batteries 
in an inert gas such as nitrogen, carbon dioxide, or a mixture of carbon 
dioxide and argon. State-of-the-art physical processing of LIBs in Europe 
and North America includes the Recupyl (France), Akkuser (Finland), 
Duesenfeld (Germany) and Retriev (USA/Canada) processes. Large-scale 
European processes do not currently use stabilization techniques 
prior to breaking cells open, instead opting for opening under an inert 
atmosphere of carbon dioxide or argon (with less than 4% molecular 
oxygen). Opening under carbon dioxide allows for the formation of a 
passivating layer of lithium carbonate on any exposed lithium metal. 
The Retriev process differs from the European processes in that it uses 
Fig. 4 | Diagram showing challenges of disassembly at different levels of 
scale. Electric-vehicle battery packs are complex in design, containing wiring 
looms, bus bars, electronics, modules, cells and other components. There are 


[Fig. 4 panel: recovered materials (depending on cell chemistry and recycling process): cobalt, nickel, lithium, graphite, manganese, aluminium, plastics.] 


a water spray during the opening step. The water hydrolyses any 
exposed lithium and acts as a heatsink, preventing thermal runaway 
during opening. 

Discharging through salt solutions or ‘brine’ (seawater has been used 
previously) is an alternative method that is intended to render the 
cells safe: corrosion allows water to leach into the cells, where it 
passivates the internal cell chemistries. Aqueous solutions of halide 
salts have been shown to result in substantial corrosion at the battery 
terminal ends, whereas alkali metal salts, such as sodium phosphate, 
produce much less corrosion with no water penetration, offering the 
possibility that cells could be assessed and re-used. This represents a 
considerably safer discharging method than using seawater; however, 
competing electrochemical reactions do occur. Oxygen, hydrogen and 
other gases, such as chlorine (depending upon the salts in the brine), 
will form at the anode and cathode terminals, and can potentially be 
collected, though the dangers and difficulties associated with this should 
not be underestimated. The time for complete discharge is dependent 
on the solubility of the salt and hence the conductivity of the solution; 
increasing the temperature will also shorten the discharge time. Once 
discharge is complete, the cell components can be separated into different 
materials streams for further processing: steel can or laminated 
aluminium, separator, anode (graphite, copper, conductive additive), 
binder and cathode (active material, aluminium, carbon black, binder). 

The brine discharge method is not suitable for high-voltage modules 
and packs, owing to the high rate of electrolysis and vigorous evolution 
of gases that would occur. However, for low-voltage modules and cells 
(or once a high-voltage pack has been dismantled into its constituent 
components) where the electrolysis can be more carefully controlled, 
this could, in principle, offer a method of discharge in which the hydro- 
gen and oxygen could be recovered for other applications, adding to 


80 | Nature | Vol 575 | 7 November 2019 


• Variety of vehicle shapes and sizes 
• Different pack configurations and locations 
• Different fixings and tooling required 

• Bolts and fixings may be rusted 
• Heads of fixings may be rounded or sheared 
• Position of bolt heads not always fixed 
• Vehicle bodywork may be distorted 
• Vehicle may be crash damaged 
• Weight of battery 

• Removal of wiring looms tricky 
• Manipulation of connectors (especially where locking tabs fitted) 
• High voltages until wiring loom/module links removed 
• Lack of data on module condition in many electric-vehicle batteries 
• Lack of labelling and identifying marks 
• Potential fire hazards 
• Potential offgassing of HF 

• Sealants may be used in module manufacture (difficult to remove) 
• Cells stuck together in modules with adhesives (difficult to separate) 
• Components may be soldered together (difficult to separate) 
• Module state of charge may not be known 

• Clean separation of anodes and cathode for direct recycling difficult 
• Very finely powdered materials present risks (nanoparticles) 
• Potential for HF compounds formed from electrolyte 
• Potential for thermal effects if cells shorted during disassembly 
• Chemistries not always known or may be proprietary 
• Additional challenges with cylindrical cells (unwinding spiral) 


also many different types of fixtures and fastenings, including screws, bolts, 
adhesives, sealants and solders, which are not designed for robotic removal. 


the cost-effectiveness of the process. The downside, however, is that 
contamination of the cell contents threatens to complicate the down- 
stream chemical processes or compromise the value of processed 
materials streams. 

An alternative to the use of salt solutions is direct Ohmic discharge 
of the battery through a load-bearing circuit. If the electricity can be 
reclaimed from the discharge, this could offset some of the cost of further 
processing. To put this into context, the domestic consumption of 
a standard UK home is up to 4,600 kWh per year. A 60-kWh battery 
pack at a 50% state of charge and a 75% state of health has a potential 
22.5 kWh for end-of-life reclamation, which would power a UK home for 
nearly 2 days. At 14.3 p per kWh, this equates to UK£3.22 per pack, which 
may seem a modest gain that does not warrant the cost of investing in 
equipment. However, if it is unrecovered, the energy from discharge 
must be dissipated, and this will add to the cooling burden of the facility, 
creating additional costs. Furthermore, an economy of scale is to be 
anticipated when recycling electric vehicle batteries in bulk. Similarly, 
reclaimed energy might make a useful contribution to the profitability 
of repurposing for second use (see section ‘Battery assessment and 
disassembly’). 
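The arithmetic behind this estimate is simple enough to reproduce. The sketch below uses only the figures quoted in the text (60 kWh, 50% state of charge, 75% state of health, 4,600 kWh per year, 14.3 p per kWh); the function names are ours, and discharge losses are ignored, so the results are an upper bound.

```python
def reclaimable_energy_kwh(nominal_kwh: float, state_of_charge: float,
                           state_of_health: float) -> float:
    """Energy left in the pack; state of charge and health given as fractions (0-1)."""
    return nominal_kwh * state_of_charge * state_of_health


def days_of_household_supply(energy_kwh: float, annual_use_kwh: float = 4600.0) -> float:
    """Days for which the energy would cover a home at its average daily consumption."""
    return energy_kwh / (annual_use_kwh / 365.0)


def value_gbp(energy_kwh: float, pence_per_kwh: float = 14.3) -> float:
    """Retail value of the energy in pounds sterling."""
    return energy_kwh * pence_per_kwh / 100.0


if __name__ == "__main__":
    energy = reclaimable_energy_kwh(60.0, 0.50, 0.75)
    print(f"reclaimable energy: {energy:.1f} kWh")
    print(f"household supply:   {days_of_household_supply(energy):.1f} days")
    print(f"value per pack:     GBP {value_gbp(energy):.2f}")
```

At the upper-bound consumption of 4,600 kWh per year, a home uses about 12.6 kWh per day, so 22.5 kWh corresponds to a little under two days of supply.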

LIB cells can be shredded at various states of charge and, from a commercial 
point of view, discharging modules or cells before they are processed 
in this way adds cost to the process. 
Furthermore, exactly what the optimum level of discharge might be 
remains unclear. Depending on cell chemistry and depth of discharge, 
over-discharging of cells can result in copper dissolution into the electrolyte. 
The presence of this copper is detrimental for materials reclamation 
as it may then contaminate all the different materials streams, 
including the cathode and separator. If the voltage is then increased 
again or ‘normal’ operation resumed, this can be dangerous because 
copper can reprecipitate throughout the cell, increasing the risks of 
short-circuiting and thermal runaway. 

Current LIB-processing technologies essentially bypass these 
concerns by feeding end-of-life batteries directly into a shredder or 
high-temperature reactor. Industrial comminution technologies can 
passivate batteries directly but recovered battery materials then require 
a complex set of physical and chemical processes to produce usable 
materials streams. Pyrometallurgical recycling processes (see section 
‘Pyrometallurgical recovery’) at scale may be 
able to accept entire electric-vehicle modules without further disassembly. 
However, this solution fails to capture much of the embodied 
energy that goes into LIB manufacture, and leaves chemical separation 
techniques with much to do as the battery materials become ever more 
intimately mixed. 


Recycling methods 
Pyrometallurgical recovery 
Pyrometallurgical metals reclamation uses a high-temperature furnace 
to reduce the component metal oxides to an alloy of Co, Cu, Fe and Ni. 
The high temperatures involved mean that the batteries are ‘smelted’, 
and the process, which is a natural progression from those used for other 
types of batteries, is already established commercially for consumer 
LIBs. It is particularly advantageous for the recycling of general consumer 
LIBs, which currently tends to be geared towards an imperfectly 
sorted feedstock of cells (indeed, the batteries can be processed along 
with other types of waste to improve the thermodynamics and products 
obtained), and this versatility is also valuable with respect to electric-vehicle 
LIBs. As the metal current collectors aid the smelting process, 
the technique has the important advantage that it can be used with 
whole cells or modules, without the need for a prior passivation step. 
The products of the pyrometallurgical process are a metallic alloy 
fraction, slag and gases. The gaseous products produced at lower tem- 
peratures (<150 °C) comprise volatile organics from the electrolyte and 
binder components. At higher temperatures the polymers decompose 
and burn off. The metal alloy can be separated through hydrometallur- 
gical processes (see section ‘Hydrometallurgical metals reclamation’) 
into the component metals, and the slag typically contains the metals 
aluminium, manganese and lithium, which can be reclaimed by further 
hydrometallurgical processing, but can alternatively be used in other 
industries such as the cement industry. There is relatively little safety 
risk in this process, as the cells and modules are all taken to extreme 
temperatures with a reductant for metal reclamation (aluminium from 
the electrode foils and packaging is a major contributor here), so the 
hazards are contained within the processing. In addition, the burning of 
the electrolytes and plastics is exothermic and reduces the energy 
consumption required for the process. It follows that in the pyrometallurgical 
process there is typically no consideration given to the reclamation 
of the electrolytes and the plastics (approximately 40-50 per cent of the 
battery weight) or other components such as the lithium salts. Despite 
environmental drawbacks (such as the production of toxic gases, which 
must be captured or remediated, and the requirement for hydrometallurgical 
post-processing), high energy costs, and the limited number 
of materials reclaimed, this remains a frequently used process for the 
extraction of high-value transition metals such as cobalt and nickel. 


Physical materials separation 

For reclamation after comminution, recovered materials can be subjected
to a range of physical separation processes that exploit variations
in properties such as particle size, density, ferromagnetism and
hydrophobicity. These processes include sieves, filters, magnets, shaker tables
and heavy media, used to separate a mixture of lithium-rich solution,
low-density plastics and papers, magnetic casings, coated electrodes and
electrode powders. The result is generally a concentration of electrode
coatings in the fine fractions of material, and a concentration of plastics,
casing materials, and metal foils in the coarse fractions’. The coarse
fractions can be put through magnetic separation processes to remove
magnetic material such as steel casings and density separation
processes to separate plastics from foils. The fine product is referred to as
the ‘black mass’, and comprises the electrode coatings (metal oxides
and carbon). The carbon can be separated from metal oxides by froth
flotation, which exploits the hydrophobicity of carbon to separate it
from the more hydrophilic metal oxides. An overview of how these
processes are used by several companies is shown in Fig. 5, which mentions
the Recupyl® (France), Akkuser” (Finland), Duesenfeld*° (Germany)
and Retriev™ (USA/Canada) processes.
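
The separation logic described above can be sketched as a simple routing function. This is only an illustration: the `Particle` type, the `route` function, both cut-off values and the example feed are hypothetical and are not taken from any of the commercial processes named in the text.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only to make the sketch runnable.
FINE_CUTOFF_MM = 1.0        # below this, material counts as fine fraction
FOIL_DENSITY_G_CM3 = 2.0    # plastics fall below, Al/Cu foils above


@dataclass
class Particle:
    size_mm: float
    density_g_cm3: float
    ferromagnetic: bool
    hydrophobic: bool


def route(p: Particle) -> str:
    """Return the product stream a particle would report to."""
    if p.size_mm < FINE_CUTOFF_MM:
        # Fine fraction ('black mass'): froth flotation floats the
        # hydrophobic carbon and leaves the hydrophilic metal oxides.
        return "carbon" if p.hydrophobic else "metal oxides"
    if p.ferromagnetic:
        # Coarse fraction, magnetic separation: steel casings.
        return "steel casing"
    # Coarse fraction, density separation: plastics versus foils.
    return "metal foil" if p.density_g_cm3 >= FOIL_DENSITY_G_CM3 else "plastics"


# Invented example feed, one particle per expected stream.
feed = [
    Particle(0.05, 2.2, False, True),    # graphite-like powder
    Particle(0.05, 4.8, False, False),   # oxide-like powder
    Particle(8.0, 7.8, True, False),     # steel fragment
    Particle(6.0, 2.7, False, False),    # aluminium foil
    Particle(5.0, 0.9, False, True),     # separator plastic
]
streams = [route(p) for p in feed]
```

The order of the checks mirrors the text: size classification first, then magnetism and density for the coarse fraction, and hydrophobicity only for the fine fraction.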

Often, the polymeric binders from the ‘black mass’ components need
to be eliminated to liberate the graphite and metal oxides from the copper
and aluminium current collectors. Published routes include the
use of sonication in a solvent such as N-methyl-2-pyrrolidone (NMP)
or dimethylformamide (DMF) to detach the cathode from the current
collector®, heat treatment to decompose the binder®™, or dissolution
of the aluminium current collector®. These processes, however,
often require high temperatures (60-100 °C) and are relatively slow
(3 h). While ultrasound can induce faster delamination (1.5 h), this is
still too slow for a continuous-flow process and the required
solvent-to-solid mass ratios of 10:1 will not be viable on a commercial scale with
these solvents™.

Recent teardowns of cells indicate that manufacturers are transition- 
ing away from fluorinated binders. Many newer batteries are moving 
toward alternative binders on the anode, such as carboxymethyl cel- 
lulose (CMC), which is water-soluble, and styrene butadiene rubber 
(SBR), which is not water-soluble but is applied as an emulsion that may
be easier to remove at end-of-life. There is also work on water-based 
binder systems for cathodes, but this is proving to be more challenging. 
Other studies have used cellulose- and lignin-based binders, although 
many of these are still in the laboratory testing phase®. 


Hydrometallurgical metals reclamation 

Hydrometallurgical treatments involve the use of aqueous solutions to
leach the desired metals from cathode material. By far the most common
combination of reagents reported is H₂SO₄/H₂O₂ (ref. °°). A number of
studies have been carried out to determine the most efficient
set of conditions to achieve an optimal leaching rate. The variables
studied include: concentration of leaching acid, time, temperature of solution, the
solid-to-liquid ratio and the addition of a reducing agent”. In most of these
studies, it was found that leaching efficiency improved when H₂O₂ was
added. Somewhat counterintuitively, it is understood that H₂O₂ acts
as a reducing agent to convert insoluble Co(III) materials into soluble
Co(II) through the reaction’:


2LiCoO₂(s) + 3H₂SO₄ + H₂O₂ → 2CoSO₄(aq) + Li₂SO₄ + 4H₂O + O₂
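
As a quick check on the leaching reaction above, the following sketch confirms that it is balanced and estimates reagent demand. The atomic masses are standard rounded values, the function names are illustrative, and the calculation assumes pure reagents at exact stoichiometry, which real leaching baths do not use.

```python
from collections import Counter

# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"H": 1.008, "Li": 6.94, "O": 15.999, "S": 32.06, "Co": 58.933}

# Elemental composition of each species in the reaction.
FORMULA = {
    "LiCoO2": {"Li": 1, "Co": 1, "O": 2},
    "H2SO4": {"H": 2, "S": 1, "O": 4},
    "H2O2": {"H": 2, "O": 2},
    "CoSO4": {"Co": 1, "S": 1, "O": 4},
    "Li2SO4": {"Li": 2, "S": 1, "O": 4},
    "H2O": {"H": 2, "O": 1},
    "O2": {"O": 2},
}

# Stoichiometric coefficients as written in the text.
REACTANTS = {"LiCoO2": 2, "H2SO4": 3, "H2O2": 1}
PRODUCTS = {"CoSO4": 2, "Li2SO4": 1, "H2O": 4, "O2": 1}


def atom_totals(side):
    """Count atoms of each element on one side of the reaction."""
    totals = Counter()
    for species, coeff in side.items():
        for element, n in FORMULA[species].items():
            totals[element] += coeff * n
    return totals


def molar_mass(species):
    """Molar mass in g/mol from the composition table."""
    return sum(ATOMIC_MASS[e] * n for e, n in FORMULA[species].items())


def reagent_kg_per_kg_lco(reagent):
    """Kilograms of pure reagent consumed per kilogram of LiCoO2."""
    mol_lco = 1000.0 / molar_mass("LiCoO2")
    mol_reagent = mol_lco * REACTANTS[reagent] / REACTANTS["LiCoO2"]
    return mol_reagent * molar_mass(reagent) / 1000.0


balanced = atom_totals(REACTANTS) == atom_totals(PRODUCTS)
acid_kg = reagent_kg_per_kg_lco("H2SO4")
peroxide_kg = reagent_kg_per_kg_lco("H2O2")
```

Under these assumptions each kilogram of LiCoO₂ consumes about 1.5 kg of H₂SO₄ and about 0.17 kg of H₂O₂, which gives a sense of why reagent volumes and neutralization costs matter for this route.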


A range of other possible leaching acids and reducing agents have
been investigated®* ”. The leached solution may also subsequently
be treated with an organic solvent to perform a solvent extraction”.
Once leached, the metals may be recovered through a number of
precipitation reactions controlled by manipulating the pH of the solution.
Cobalt is usually extracted either as the sulfate, oxalate, hydroxide or
carbonate” ”, and then lithium can be extracted through a precipitation
reaction forming Li₂CO₃ or Li₃PO₄°°*!. An alternative recycling method
describes mechanochemical treatment of materials, where electrode
materials are ground with a chlorine compound or complexing agent
to produce water-soluble salts of cobalt, which can be separated from
insoluble fractions by washing with water®*’.

Fig. 5 | Flow chart representing potential routes for the circular economy of
LIBs, detailing second-use applications, re-use, physical recovery, chemical
recovery and biorecovery. A range of commercial entities have

Most current recycling processes fall under the umbrella of ‘reagent
recovery’ because the materials, with sufficient purity, can be re-used
not just for resynthesizing the original cathode materials, but also in a
range of other applications, such as the synthesis of CoFe₂O₄ or MnCo₂O₄
(refs. 848°). Following initial work focused on the leaching and
remanufacture of LiCoO₂ (ref. 8”), work has since moved on to
strategies for new cell chemistries, which typically contain multiple transition
metals (for example, LiNi_{1−x−y}Mn_{x}Co_{y}O_{2}; NMC). In such cases, once the
metals have been leached from the cathode material, either sequential
precipitation is employed to recover the individual metals, or the direct
remanufacture of the cathode is targeted, such as work to recover NMC®.
In this work, after leaching the metals from the cathode, the
concentrations of the various metals in solution were measured and adjusted to
match those in the target material (1:1:1 Ni:Mn:Co for NMC-111). The same
group has applied the technique to NMC with varying metal contents and
successfully resynthesized such NMC materials through the production
of a precursor hydroxide, Ni_{x}Mn_{y}Co_{z}(OH)_{2}, with x, y and z varying
according to the desired final composition of the cathode.
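
The concentration adjustment described above can be illustrated with a small calculation. This sketch assumes that the leachate is corrected only by adding metal salts (never by removing metal), which the text does not state; the function name `top_up` and the measured concentrations are invented for illustration.

```python
def top_up(measured, target_ratio):
    """Moles per litre of each metal to add so the leachate matches target_ratio.

    measured: metal -> concentration in mol/L
    target_ratio: metal -> relative molar share (all values > 0)
    """
    # The metal that is most abundant relative to its target share
    # fixes the scale; every other metal is topped up to match it.
    scale = max(measured[m] / target_ratio[m] for m in target_ratio)
    return {m: scale * target_ratio[m] - measured[m] for m in target_ratio}


# Invented leachate analysis in mol/L.
measured = {"Ni": 0.30, "Mn": 0.25, "Co": 0.40}

# 1:1:1 Ni:Mn:Co, as for NMC-111 in the text.
nmc111 = {"Ni": 1, "Mn": 1, "Co": 1}

additions = top_up(measured, nmc111)
```

For this invented leachate, cobalt is already the limiting reference, so only nickel and manganese are added. Passing a different `target_ratio` covers the varying metal contents mentioned in the text.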

commercialized processes for recycling LIBs. Different approaches for the
physical separation of batteries and the recovery of materials are indicated.

Other groups have published similar recovery methods with modifications
such as additional solvent extraction steps”, lactic acid or urea as
an alternative to sulfuric acid (additionally facilitating resynthesis)”’” as
well as investigating the effect of magnesium in the resynthesized
material??. The main issues to be addressed with all solvo-metallurgical
processes are the volumes of solvents required, the speed of delamination,
the costs of neutralization and the likelihood of cross-contamination of
materials. Although shredding is a fast and efficient method of rendering
the battery materials safe, mixing the anode and cathode materials at
the start of the recycling process complicates downstream processing.
A method in which anode and cathode assemblies could be separated
prior to mechanical or solvent-based separation would greatly improve
material segregation. This is one of several key areas where designing
for end-of-life recycling promises to have a real impact, but the historic
backlog of batteries containing polyvinylidene fluoride (PVDF) as a
binder will still need to be processed. It is clear that the current design 
of cells makes recycling extremely complex and neither hydro- nor 
pyrometallurgy currently provides routes that lead to pure streams of 
material that can easily be fed into a closed-loop system for batteries. 


Direct recycling 
The removal of cathode or anode material from the electrode for 
reconditioning and re-use in a remanufactured LIB is known as direct 
recycling. In principle, mixed metal-oxide cathode materials can be 
reincorporated into a new cathode electrode with minimal changes to 
the crystal morphology of the active material. In general, this will require 
the lithium content to be replenished to compensate for losses due to 
degradation of the material during battery use and because materials 
may not be recovered from batteries in the fully discharged state with 
the cathodes fully lithiated. So far, work in this area has focused primarily 
on laptop and mobile phone batteries, as a result of the larger amounts
of these available for recycling®®. An example of how this recycling route 
could work has been outlined recently”. Cathode strips, obtained after 
dismantling spent batteries, were soaked in NMP before undergoing 
sonication. Powders were either regenerated through simple solid-state 
synthesis with the addition of fresh Li₂CO₃ or treated hydrothermally
with a solution containing LiOH/Li₂SO₄ before annealing.

For high-cobalt cathodes such as lithium cobalt oxide (LCO),
conventional pyrometallurgical (see section ‘Pyrometallurgical recovery’) or
hydrometallurgical (see section ‘Hydrometallurgical metals reclamation’)
recycling processes can recover around 70% of the cathode value”. However,
for other cathode chemistries that are not as cobalt-rich, this figure
drops notably”. A 2019 648-lb Nissan Leaf battery, for example, costs
US$6,500-8,500 new, but the value of the pure metals in the cathode
material is less than US$400 and the cost of the equivalent amount of
NMC (an alternative cathode material) is in the region of US$4,000. It
is important, therefore, to appreciate that cathode material must be
directly recycled (or upcycled) to recover sufficient value. As direct
recycling avoids lengthy and expensive purification steps, it could be
particularly advantageous for lower-value cathodes such as LiMn₂O₄
and LiFePO₄, where manufacturing of the cathode oxides is the major
contributor to cathode costs, embedded energy and carbon dioxide
footprint®.

Fig. 6 | Comparison of different LIB recycling methods.
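
The Nissan Leaf figures quoted above can be put side by side with a few lines of arithmetic. The numbers are those given in the text; treating "less than US$400" as an upper bound and "in the region of US$4,000" as a point value is a simplification made here, and the variable and function names are illustrative.

```python
# Figures quoted in the text (US dollars).
BATTERY_COST_USD = (6500, 8500)   # price range of a new pack
PURE_METAL_VALUE_USD = 400        # upper bound: "less than US$400"
NMC_EQUIVALENT_USD = 4000         # "in the region of US$4,000"


def share_of_new_cost(value_usd):
    """Return (lowest, highest) share of the new-pack price that a value represents."""
    low_price, high_price = BATTERY_COST_USD
    return (value_usd / high_price, value_usd / low_price)


metal_share = share_of_new_cost(PURE_METAL_VALUE_USD)
cathode_share = share_of_new_cost(NMC_EQUIVALENT_USD)
uplift = NMC_EQUIVALENT_USD / PURE_METAL_VALUE_USD
```

On these numbers the elemental metals are worth at most about 5 to 6 per cent of the price of a new pack, whereas the equivalent cathode material is worth roughly half of it, a tenfold difference that motivates the argument for direct recycling.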

Direct recycling also has the advantage that, in principle, all battery 
components” can be recovered and re-used after further processing 
(with the exclusion of separators). Although there is substantial litera- 
ture regarding the recycling of the cathode component from spent LIBs, 
research on recycling of the graphitic anode is limited, owing to its lower
recovery value. Nevertheless, the successful re-use of mechanically
separated graphite anodes from spent batteries has been demonstrated,
with similar properties to those of pristine graphite”®.

Despite the potential advantages of direct recycling, however, consid- 
erable obstacles remain to be overcome before it can become a practical 
reality. The efficiency of direct recycling processes is correlated with the 
state of health of the battery and may not be advantageous where the 
state of charge is low”. There are also potential issues with the flexibility 
of these routes to handle metal oxides of different compositions. For 
maximum efficiency, direct recycling processes must be tailored to
specific cathode formulations, necessitating different processes for
different cathode materials”. The ten or so years spent in a vehicle (followed,
perhaps, by a few more in a second-use application) therefore present
a challenge in an industry where battery formulations are evolving at a
rapid pace. Direct recycling may struggle to accommodate feedstocks
of unknown or poorly characterized provenance, and there will be com- 
mercial reluctance to re-use material if product quality is affected. 

The direct recycling route for cathode coatings is also highly sensitive
to contamination by other metals, such as aluminium, which results in
poor electrochemical performance“. In particular, methods of recovering
materials for further physical or chemical separation that involve a
high degree of comminution form fine particles of Al and Cu, which are
difficult to separate from the electrode coatings. For this reason,
processes that do not mechanically stress the electrode foils are favoured
in direct recycling, and separation of the materials streams prior to
mechanical sorting is preferable. However, methods of removing the
electrode binder (typically pyrolysis or dissolution) present further
challenges, including the production of hazardous byproducts such as
HF from pyrolysis of the PVDF binder or the use of the highly toxic
NMP as a solvent for dissolution. The potential for the undesirable
reaction of the PVDF binder with the electrode material appears to
be a notable omission in the recycling literature, despite a growing
body of research illustrating that PVDF is an excellent low-temperature
fluorinating reagent for metal oxides’. Furthermore, recent research
suggests that a certain degree of reaction can occur with the cathode
even under conditions of normal cell operation”.


Biological metals reclamation 

Bioleaching, in which bacteria are harnessed to recover valuable met- 
als, has been used successfully in the mining industry’°°™™. This is an 
emerging technology for LIB recycling and metal reclamation and is 
potentially complementary to the hydrometallurgical and pyromet- 
allurgical processes currently used for metal extraction’””"’; cobalt 
and nickel, in particular, are difficult to separate and require additional 
solvent-extraction steps. The process uses microorganisms to digest 
metal oxides from the cathode selectively'™ and to reduce these oxides 
to produce metal nanoparticles’*"°*. The number of studies that have 
been performed thus far, however, is relatively small and there is plenty 
of opportunity for further investigation in this field. The recycling meth- 
ods discussed are compared in Fig. 6. 


Summary and opportunities 


The electric-vehicle revolution is set to change the automotive industry 
radically, and some of the most profound changes will inevitably relate 
to the management and decommissioning of vehicles at end-of-life. Of
chief concern are the complex, high-tech power trains and, in particular, 
the LIBs. To put this into perspective, electrification of only 2% of the 
current global car fleet would represent a line of cars—and in due course, 
of end-of-life waste—that could stretch around the Earth. There is wide 
acceptance that, for environmental and safety reasons, stockpiling (or 
worse, landfill) and wholesale transport of end-of-life electric-vehicle 
batteries are not attractive options, and that the management of end- 
of-life electric-vehicle waste will require regional solutions. 

In the waste management hierarchy, re-use is considered preferable
to recycling, in order to extract maximum economic value and minimize
environmental impacts. Many companies in various parts of the world 
are already piloting the second use of electric-vehicle LIBs for a range 
of energy storage applications. Advanced sensors and improved meth- 
ods of monitoring batteries in the field and end-of-life testing would 
enable the characteristics of individual end-of-life batteries to be bet- 
ter matched to proposed second-use applications, with concomitant 
advantages in lifetime, safety and market value. Even if all the benefits of
second-use are realized, however, it must be remembered that recycling
(if not landfill) is the inevitable fate of all batteries. 

Some recent life-cycle analyses have indicated that the application of
current recycling processes to the present generation of electric-vehicle 
LIBs may not in all cases result in reductions in greenhouse gas emis- 
sions compared to primary production’. More efficient processes are 
urgently needed to improve both the environmental and economic 
viability of recycling, which at present is heavily dependent on cobalt 
content. However, as the amount of cobalt in cathodes is reduced for 
economic and other reasons, recycling using current methods will become
less advantageous owing to the lower value of the materials recovered. 

At present, there are low volumes of electric-vehicle batteries that 
require recycling. As these volumes increase dramatically, there are 
questions concerning the economies (and diseconomies) of scale in 
relation to recycling operations®’. Pyrometallurgical routes, in par- 
ticular, suffer from high capital costs, and if full recyclability of LIBs is 
to be achieved, alternative methods are urgently required, rather than 
seeking to recycle only the most economically valuable components. 

There are a number of lessons that the future LIB recycling industry
could learn from the highly successful lead-acid battery recycling
industry. As a technology, lead-acid batteries are relatively standardized and
simple to disassemble and recycle, which minimizes costs, allowing the
value of lead to drive recycling. Unfortunately, for a rapidly developing
technology such as electric-vehicle LIBs, such advantages are not likely
to apply any time soon. 

A number of improvements could make electric-vehicle LIB recycling
processes economically more efficient”, such as better sorting tech- 
nologies, a method for separating electrode materials, greater process 
flexibility, design for recycling, and greater manufacturer standardiza- 
tion of batteries. There is a clear opportunity for a more sophisticated 
approach to battery recovery through automated disassembly, smart 
segregation of different batteries and the intelligent characterization, 
evaluation and ‘triage’ of used batteries into streams for remanufacture, 
re-use and recycling. The potential benefits of this are many and include 
reduced costs, higher value of recovered material streams, and the near 
elimination of the risk of harm to human workers. 

The design of current battery packs is not optimized for easy disassembly.
The use of adhesives, bonding methods and fixtures does not lend
itself to easy deconstruction either by hand or machine. All reported
current commercial physical cell-breaking processes employ shredding 
or milling with subsequent sorting of the component materials. This 
makes the separation of the components more difficult than if they 
were presorted and considerably reduces the economic value of waste 
material streams. Many of the challenges this presents to remanufacture,
re-use and recycling could be addressed if considered early in the
design process. 

For direct recycling where purity of the recovered materials is 
required, a process which involves less component contamination dur- 
ing the breaking stage is important. This would benefit from an analysis 
of the cell component chemistries, and the state of charge and state of 
health of the cells before disassembly into the component parts, rather 
than the production of a mixture of all components. At present, this separation
has only been performed at a laboratory scale and usually employs
manual disassembly methods that are difficult to scale up economically. 
The move to greater automation and robotic disassembly promises to 
overcome some of these hurdles. Issues regarding the binder still need 
to be resolved, and acid, alkali, solvent and thermal treatments all have
their positives and negatives. A cell design that facilitates reclamation of
materials, with low-cost water-soluble binders, is extremely appealing.

We have focused here on the scientific challenges of recycling LIBs, 
but we recognize that the ‘system performance’ of the LIB recycling 
industry will be strongly affected by a range of non-technical factors, 
such as the nature of the collection, transportation, storage and logistics
of LIBs at the end-of-life. As these vary from country to country
and region to region, it follows that different jurisdictions may arrive
at different answers to the problems posed. Research is under way
in the Faraday Institution ReLiB Project, UK; the ReCell Project, US; at
CSIRO in Australia and in a number of European Union projects including
ReLieVe, Lithorec and AmplifII.

Recycling electric-vehicle batteries at end-of-life is essential for many 
reasons. At present there is little hope that profitable processes will be
found for all current and future types of electric-vehicle LIBs
without substantial successful research and development, so the imperative
to recycle will derive primarily from the desire to avoid landfill and
to secure the supply of strategic elements. The environmental and eco- 
nomic advantages of second-use and the low volume of electric-vehicle 
batteries currently available for recycling could stifle the development 
of a recycling industry in some places. In many nations, the elements
and materials contained in the batteries are not available, and access to 
resources is crucial in ensuring a stable supply chain. Electric vehicles 
may prove to be a valuable secondary resource for critical materials.
Careful husbandry of the resources consumed by electric-vehicle battery
manufacturing (and recycling) surely holds the key to the sustainability
of the future automotive industry.

costs of reducing emissions or removing carbon dioxide from the atmosphere. Here
we review ten pathways for the utilization of carbon dioxide. Pathways that involve 
chemicals, fuels and microalgae might reduce emissions of carbon dioxide but have 
limited potential for its removal, whereas pathways that involve construction 
materials can both utilize and remove carbon dioxide. Land-based pathways can 
increase agricultural output and remove carbon dioxide. Our assessment suggests 
that each pathway could scale to over 0.5 gigatonnes of carbon dioxide utilization 
annually. However, barriers to implementation remain substantial and resource 
constraints prevent the simultaneous deployment of all pathways.

CO₂ utilization is receiving increasing interest from the scientific
community’. This is partly due to climate change considerations and partly
because using CO₂ as a feedstock can result in a cheaper or cleaner
production process compared with using conventional hydrocarbons’.
CO₂ utilization is often promoted as a way to reduce the net costs (or
increase the profits) of reducing emissions or removing carbon dioxide
from the atmosphere, and therefore as a way to aid the scaling of
mitigation or removal efforts’. CO₂ utilization is also seen variously as
a stepping stone towards‘ or a distraction away from’ the successful
implementation of carbon capture and storage (CCS) at scale.

In most of the literature, including the IPCC 2005 Special Report on
Carbon Dioxide Capture and Storage®, the term ‘CO₂ utilization’ refers
to the use of CO₂, at concentrations above atmospheric levels, directly
or as a feedstock in industrial or chemical processes, to produce valuable
carbon-containing products® “. Included in this conventional
definition is the industrial production of fuels using, for example, amines
to capture and concentrate the CO₂ from air, potentially with solar
energy. However, the definition excludes cases in which an identical
fuel is produced from the same essential inputs, but the CO₂ utilized
is captured by plant-based photosynthetic processes.

Here, we consider CO₂ utilization to be a process in which one or
more economically valuable products are produced using CO₂, whether
the CO₂ is supplied from fossil-derived waste gases, captured from the
atmosphere by an industrial process, or (in a departure from most,
but not all’”’’, of the literature) captured biologically by land-based
processes. Biological or land-based forms of CO₂ utilization can
generate economic value in the form of, for example, wood products for
buildings, increased plant yields from enhanced soil carbon uptake,
and even the production of biofuel and bio-derived chemicals. We use
this broader definition deliberately; by thinking functionally, rather 
than narrowly about specific processes, we hope to promote dialogue 
across scientific fields, compare costs and benefits across pathways, 
and consider common techno-economic characteristics across path- 
ways that could potentially assist in the identification of routes towards 
the mitigation of climate change. 

In this Perspective, we consider a non-exhaustive selection of ten
CO₂ utilization pathways and provide a transparent assessment of the
potential scale and cost for each one. The ten pathways are as follows:
(1) CO₂-based chemical products, including polymers; (2) CO₂-based
fuels; (3) microalgae fuels and other microalgae products; (4) concrete
building materials; (5) CO₂ enhanced oil recovery (CO₂-EOR); (6)
bioenergy with carbon capture and storage (BECCS); (7) enhanced
weathering; (8) forestry techniques, including afforestation/reforestation,
forest management and wood products; (9) land management via soil
carbon sequestration techniques; and (10) biochar.

These ten CO₂ utilization pathways can also be characterized as
‘cycling’, ‘closed’ and ‘open’ utilization pathways (Fig. 1, Table 1,
Supplementary Materials). For instance, many (but not all)
conventional industrial utilization pathways, such as CO₂-based fuels and
chemicals, tend to be ‘cycling’: they move carbon through industrial
systems over timescales of days, weeks or months. Such pathways
do not provide net CO₂ removal from the atmosphere, but they can
reduce emissions via industrial CO₂ capture that displaces fossil
fuel use. By contrast, ‘closed’ pathways involve utilization and
near-permanent CO₂ storage, such as in the lithosphere (via CO₂-EOR or
BECCS), in the deep ocean (via terrestrial enhanced weathering)
or in mineralized carbon in the built and natural environments.
Finally, ‘open’ pathways tend to be based in biological systems,
and are characterized by large removal potentials and storage in
‘leaky’ natural systems (such as biomass and soil) with the risk of
large-scale flux back to the atmosphere.

Of the pathways we discuss, some are novel or emerging (such as
CO₂-fuels, for which current flows are near-zero), whereas others are
well established, such as CO₂-EOR and afforestation/reforestation.
Pathways were selected on the basis of discussions at a joint meeting
of the US National Academy of Sciences and the UK Royal Society!; each
pathway is relatively well studied to date and has an acknowledged
potential to scale. There are many other pathways that meet our
definition but are not reviewed here (Supplementary Materials).

This Perspective is structured as follows: first, the ten utilization
pathways are presented in the context of the scale of CO₂ stocks and flows
on Earth. Second, the potential scale and economics of each pathway
are assessed. Third, a selection of key barriers to scaling is identified.
Fourth, we assess the outlook for CO₂ utilization, and conclude with
priorities for future research and policy.

Pathway
(1) Chemicals from CO₂
(2) Fuels from CO₂
(3) Products from microalgae
(4) Concrete building materials
(5) CO₂-EOR
(6) Bioenergy with carbon capture and storage
(7) Enhanced weathering
(8) Forestry techniques
(9) Soil carbon sequestration techniques
(10) Biochar

Fig. 1 | Stocks and net flows of CO₂ including potential utilization and
removal pathways. Orange, red and purple arrows (numbered 1-10, as
described in Table 1) represent cycling, closed and open pathways for CO₂
utilization and removal. Teal block arrows represent annual flows to and from
the atmosphere, with estimates averaged over the 2008-2017 period’*"!.
Estimates of stocks in the Earth’s spheres (lithosphere, biosphere, hydrosphere
and atmosphere, labelled in bold) and selected stock subcategories are given.
All estimates are based on IPCC estimates” except where noted, and are
converted from C to CO₂. Carbon stocks in the hydrosphere comprise seawater,
sediment, and dissolved organic carbon (not shown, around 2,600 Gt CO₂). The
vast majority of carbon stocks in the lithosphere are locked in the Earth’s crust”,
with estimated accessible fossil fuel reserves and resources of more than
45,000 Gt CO₂”». Atmospheric stocks are converted from the 2017 estimates of
atmospheric CO₂ of 405 ppm” using a conversion factor of 2.12. Carbon stocks
in the biosphere include those stored in permafrost and wetlands (not shown,
around 7,500 Gt CO₂), vegetation, and soils. Soil stocks to 1-m depth have been
recently estimated at 5,500 Gt CO₂”.
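
The atmospheric stock conversion mentioned in the Fig. 1 caption can be reproduced as a short calculation. The caption gives only the factor 2.12; reading it as gigatonnes of carbon per ppm of CO₂, and then converting carbon to CO₂ by the ratio of molar masses, are assumptions made for this sketch, as are the names used.

```python
PPM_CO2_2017 = 405.0            # atmospheric CO2 in 2017, from the caption
GT_C_PER_PPM = 2.12             # caption's factor; the unit (Gt C per ppm) is assumed
CO2_PER_C = 44.009 / 12.011     # molar mass of CO2 divided by that of C


def atmospheric_stock_gt_co2(ppm):
    """Convert an atmospheric CO2 mixing ratio in ppm to a stock in Gt CO2."""
    gt_carbon = ppm * GT_C_PER_PPM
    return gt_carbon * CO2_PER_C


stock_gt_co2 = atmospheric_stock_gt_co2(PPM_CO2_2017)
```

Under these assumptions the atmospheric stock comes to a little over 3,100 Gt CO₂, the same order of magnitude as the soil and permafrost stocks quoted in the caption.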


CO₂ utilization and the carbon cycle


The amount of carbon dioxide that is utilized by a pathway is not
necessarily the same as the amount of carbon dioxide removed or carbon
dioxide stored. CO₂ utilization does not necessarily reduce emissions
and does not necessarily deliver a net climate benefit, once indirect and
other effects have been accounted for. The various concepts overlap
and relate to each other, but are distinct (Supplementary Fig. 1,
Supplementary Materials). Some carbon capture and utilization (CCU)
processes achieve carbon dioxide removal (CDR) from the atmosphere,
and some involve CCS. CCS itself can contribute either to the mitigation
of CO₂ (for example, by reducing net emissions from a gas-fired power
plant) or to atmospheric removals (for example, by direct air carbon
capture and storage, or DACCS); CCS does not necessarily imply CDR.
Furthermore, CCS and CDR can fail to deliver a climate benefit. For
instance, perverse indirect effects (such as land-use change resulting
from BECCS“) could increase net atmospheric CO₂ concentrations.


Table 1 | Ten CO₂ utilization and removal pathways

(1) Chemicals from CO₂
    Removal and/or capture: Catalytic chemical conversion of CO₂ from flue gas or other sources into chemical products
    Utilization product: CO₂-derived platform chemicals such as methanol, urea and plastics
    Storage and likelihood of release (high/low): Various chemicals (days/decades) - high
    Emission on use or release during storage: Hydrolysis or decomposition
    Example cycles: CLG; KCLF; ALFJ; ALG

(2) Fuels from CO₂
    Removal and/or capture: Catalytic hydrogenation processes to convert CO₂ from flue gas or other sources into fuels
    Utilization product: CO₂-derived fuels such as methanol, methane and Fischer-Tropsch-derived fuels
    Storage and likelihood of release (high/low): Various fuels (weeks/months) - high
    Emission on use or release during storage: Combustion
    Example cycles: CLG; ALG

(3) Products from microalgae
    Removal and/or capture: Uptake of CO₂ from the atmosphere or other sources by microalgae biomass
    Utilization product: Biofuels, biomass, or bioproducts such as aquaculture feed
    Storage and likelihood of release (high/low): Various products (weeks/months) - high
    Emission on use or release during storage: Combustion (fuel) or consumption (bioproduct)
    Example cycles: CLG; BG

(4) Concrete building materials
    Removal and/or capture: Mineralization of CO₂ from flue gas or other sources into industrial waste materials, and CO₂ curing of concrete
    Utilization product: Carbonated aggregates or concrete products
    Storage and likelihood of release (high/low): Carbonates (centuries) - low
    Emission on use or release during storage: Extreme acid conditions
    Example cycles: CLF; ALF

(5) CO₂-EOR
    Removal and/or capture: Injection of CO₂ from flue gas or other sources into oil reservoirs
    Utilization product: Oil
    Storage and likelihood of release (high/low): Geological sequestration (millennia) - low
    Emission on use or release during storage: n.a.
    Example cycles: CD

(6) Bioenergy with carbon capture and storage (BECCS)
    Removal and/or capture: Growth of plant biomass
    Utilization product: Bioenergy crop biomass
    Storage and likelihood of release (high/low): Geological sequestration (millennia) - low
    Emission on use or release during storage: n.a.
    Example cycles: BCD

(7) Enhanced weathering
    Removal and/or capture: Mineralization of atmospheric CO₂ via the application of pulverized silicate rock to cropland, grassland and forests
    Utilization product: Agricultural crop biomass
    Storage and likelihood of release (high/low): Aqueous carbonate (centuries) - low
    Emission on use or release during storage: Extreme acidic conditions
    Example cycles: BE

(8) Forestry techniques
    Removal and/or capture: Growth of woody biomass via afforestation, reforestation or sustainable forest management
    Utilization product: Standing biomass, wood products
    Storage and likelihood of release (high/low): Standing forests and long-lived wood products (decades to centuries) - high
    Emission on use or release during storage: Disturbance, combustion or decomposition
    Example cycles: BFJ

(9) Soil carbon sequestration techniques
    Removal and/or capture: Increase in soil organic carbon content via various land management practices
    Utilization product: Agricultural crop biomass
    Storage and likelihood of release (high/low): Soil organic carbon (years to decades) - high
    Emission on use or release during storage: Disturbance or decomposition
    Example cycles: BFJ

(10) Biochar
    Removal and/or capture: Growth of plant biomass for pyrolysis and application of char to soils
    Utilization product: Agricultural or bioenergy crop biomass
    Storage and likelihood of release (high/low): Black carbon (years to decades) - high
    Emission on use or release during storage: Decomposition
    Example cycles: BFJ

n.a., not applicable.
The ten pathways are depicted in Fig. 1 and are represented as a combination of steps in Fig. 2.
Removal and/or capture corresponds to steps A, B and/or C in Fig. 2.
Storage corresponds to steps D, E or F in Fig. 2.
Storage durations represent best-case scenarios. For instance, in CO₂-EOR, if the well is operated with complete recycle, the CO₂ is trapped and can be stored on a timescale of centuries or more”. This is also relevant only for conventional operations.
Emission on use corresponds to step G in Fig. 2.
Release during storage corresponds to steps H, I or J in Fig. 2.
The letters stated are the steps from Fig. 2 that comprise the example cycle.
Release during geological storage is usually a consequence of engineering implementation error.


CO₂ utilization does not necessarily contribute to addressing climate
change, and careful analysis is essential to determine its overall impact.
Identifying the counterfactual (what would have happened without
CO₂ utilization) is important but is often particularly challenging,
and the impact of a given CO₂ utilization pathway on the mitigation of
climate change varies as a function of space and time (Box 1).

For CO₂ utilization to contribute usefully to the reduction of
atmospheric CO₂ concentrations, the scale of the pathways must be
meaningful in comparison with the net flows of CO₂ shown in Fig. 1. The flux
of carbon from fossil fuels and industry to the atmosphere (34 Gt CO₂
yr⁻¹) is dwarfed by the gross flux to land via photosynthesis in plants
(440 Gt CO₂ yr⁻¹). However, only 2%-3% of this photosynthetic carbon
remains on land (12 Gt CO₂ yr⁻¹), and only for decades; the remainder is
re-emitted by plant and soil respiration. If soil carbon uptake could be
increased by 0.4% per year, this would contribute to achieving net zero
emissions, as per the '4 per mille' initiative, but this is challenging. Of
the ten pathways we discuss, five leverage our ability to perturb these
land-based fluxes.
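
As a rough check on these magnitudes, the following Python sketch recomputes the retained share of the photosynthetic flux from the figures quoted above. The constant and function names are illustrative and do not come from the source.

```python
# Carbon fluxes quoted in the text, in Gt CO2 per year.
FOSSIL_AND_INDUSTRY = 34.0     # fossil fuels and industry to the atmosphere
GROSS_PHOTOSYNTHESIS = 440.0   # gross flux to land via photosynthesis
RETAINED_ON_LAND = 12.0        # portion that remains on land (for decades)


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


retained_pct = share(RETAINED_ON_LAND, GROSS_PHOTOSYNTHESIS)
photo_vs_fossil = GROSS_PHOTOSYNTHESIS / FOSSIL_AND_INDUSTRY

print(f"Retained on land: {retained_pct:.1f}% of gross photosynthesis")
print(f"Gross photosynthesis is {photo_vs_fossil:.1f}x the fossil flux")
```

The retained share falls inside the 2%-3% band stated in the text, which is why even a small relative change in land-based fluxes is large in absolute terms.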


The other five conventional industrial CO₂ utilization pathways could
also perturb the net flows of CO₂. The production of plastics and other
products creates a demand for so-called 'socioeconomic carbon'
(around 2.4 Gt CO₂ yr⁻¹, of which around two-thirds is wood products)
that could be met in part through CO₂ utilization. The total stock of
carbon accumulated in products (such as wood products, bitumen,
plastic and cereals) has been estimated at 42 Gt CO₂ in 2008, of which
25 Gt CO₂ is in wood products. Up to 16 Gt CO₂ was sequestered in
human infrastructure as mineralized carbonates in cement between
1930 and 2013, with current rates estimated to be around 1 Gt CO₂ yr⁻¹.

The flow of CO₂ through the different utilization pathways can be
represented by a combination of different steps (labels A to L; Fig. 2,
Table 1). Utilization pathways often (but not always) involve removal
(A or B) and storage (D, E or F); however, the permanence of CO₂
storage varies greatly from one utilization pathway to another, with
storage timeframes ranging from days to millennia. In part, permanence
depends upon where the carbon ends up (Fig. 1): the lithosphere, by
geological sequestration into reservoirs such as saline aquifers or


Fig. 2 | Carbon dioxide utilization and removal cycle. Utilization pathways are
represented as a combination of steps, A-L. Green arrows trace an example open
pathway, forestry (BFJ). Red arrows trace an example cycling pathway, CO₂ fuels
with direct air capture (ALG). Blue arrows trace an example closed pathway,


depleted oil and gas reservoirs, or by mineralization into rocks; the
biosphere, in trees, soils and the human-built environment; or the
hydrosphere, with storage in the deep oceans. Geological storage,
when executed correctly, is considered to be more permanent than
storage in the biosphere, which is shorter and subject to human and
natural disturbances such as wildfires and pests, as well as changes
in climate. However, even 'closed' pathways do not offer completely
permanent storage over geological timescales (more than 100,000
years), which gives rise to intergenerational ethical questions.

In the short term, the creation of products from concentrated CO₂,
as in step L (although CO₂ conversion is not a necessary requirement for
utilization), could leverage the industrial capture of flue gases following
the extraction and combustion of fossil fuels (KC). In the longer term,
the CO₂ loop will need to be closed in order to achieve net zero emissions,
implying that CO₂ will need to be sourced from the atmosphere,
potentially via direct air capture (A) or through land-based uptake by
photosynthesis or mineralization (B). For instance, net zero CO₂-based fuels
must shift the current flows of carbon from a lithosphere-to-atmosphere
cycle (KCLG) to an atmosphere-to-atmosphere cycle (ALG) (Fig. 2).
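
The step notation lends itself to a small lookup. The sketch below is an illustrative Python helper, not part of the original analysis. It classifies a cycle string by its final step, using the rule given in the Fig. 2 caption (cycling pathways end with G, closed pathways with D, E or F, open pathways with J), and flags whether the carbon is sourced from the atmosphere (first step A or B). The function names are hypothetical, and the polymer exception noted in the caption is not handled.

```python
def classify_cycle(cycle: str) -> str:
    """Classify a Fig. 2 step sequence (e.g. 'KCLG') by its final step."""
    last = cycle[-1].upper()
    if last == "G":                 # emission on use
        return "cycling"
    if last in ("D", "E", "F"):     # storage steps
        return "closed"
    if last == "J":                 # release during storage (open pathways)
        return "open"
    raise ValueError(f"cycle {cycle!r} does not end in a terminal step")


def sourced_from_atmosphere(cycle: str) -> bool:
    """True if the cycle starts with direct air capture (A) or land uptake (B)."""
    return cycle[0].upper() in ("A", "B")


for name, cycle in [("fossil CO2 fuels", "KCLG"),
                    ("air-capture CO2 fuels", "ALG"),
                    ("CO2-EOR", "KCD"),
                    ("forestry", "BFJ")]:
    print(name, cycle, classify_cycle(cycle), sourced_from_atmosphere(cycle))
```

Under this reading, KCLG and ALG are both cycling pathways, and only the first step distinguishes the fossil-sourced loop from the atmosphere-to-atmosphere loop.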


Scale and economics of CO₂ utilization


We assess the peer-reviewed literature on the ten pathways, which
comprises over 11,000 papers. For the conventional pathways, our
scoping review covered over 5,000 papers, a minority (186) of which
provide cost estimates. Estimates of potential scale were informed
by a structured estimation process and an expert opinion survey. For
the non-conventional utilization pathways, we build upon existing
CO₂ removal estimates (also derived from a scoping review of over
6,000 papers, of which 927 provide usable estimates, and an expert
judgement process) and identify preliminary published research on the
relationship between CO₂ removal and CO₂ utilization to offer estimates
of the scale and cost of CO₂ utilization.

Where possible, we calculate breakeven costs in 2015 US dollars per
tonne of CO₂ for each pathway (hereafter, all costs stated are in US
CO₂-EOR (KCD). Cycling pathways (with the exception of polymers) end with step
G; closed pathways end with steps D, E or F; and open pathways end with step J.
See Table 1 for further description. All flows are net of process emissions.


dollars). The breakeven CO₂ cost represents the incentive per tonne of
CO₂ utilized that would be necessary to make the pathway economic
(see Supplementary Materials, S1.2). This can be thought of as the
breakeven (theoretical) subsidy per tonne of CO₂ utilization, although we are
not recommending such a subsidy.


Conventional utilization pathways 

Dependent on a multitude of technological, policy and economic factors
that remain unresolved, each of the conventional pathways (chemicals,
fuels, microalgae, building materials and CO₂-EOR) might utilize around
0.5 Gt CO₂ yr⁻¹ or more in 2050. We also estimate that between 0.2 and
3.2 Gt CO₂ yr⁻¹ could be removed and stored in the lithosphere or in the
biosphere for centuries or more.
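
One way to see where the 0.2 to 3.2 Gt CO₂ yr⁻¹ range can come from is to add the 2050 removal ranges that Table 2 reports for the two conventional pathways with long-lived storage, concrete building materials and CO₂-EOR. Treating the range as this simple sum is a reading of the tabulated numbers, not a method stated in the text, and the names in the Python sketch below are illustrative.

```python
# 2050 removal potential from Table 2, in Mt CO2 per year (low, high).
removal_mt = {
    "concrete building materials": (100, 1400),
    "enhanced oil recovery": (100, 1800),
}

# Sum the low ends and the high ends separately, then convert Mt to Gt.
low_gt = sum(low for low, _ in removal_mt.values()) / 1000
high_gt = sum(high for _, high in removal_mt.values()) / 1000

print(f"Long-lived removal and storage: {low_gt} to {high_gt} Gt CO2 per year")
```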


Chemicals. CO₂ can be transformed efficiently into a range of
chemicals, but only a few of the technologies are economically viable and
scalable. Some are commercialized, such as the production of urea
and polycarbonate polyols. Some are technically possible but are not
widely adopted, such as the production of CO₂-derived methanol in the
absence of carbon monoxide (methanol is a platform chemical for a
multitude of other reaction pathways, including to fuels, and is mainly
manufactured via the hydrogenation of a mixture of CO and 1%-2% CO₂).
Breakeven costs per tonne of CO₂, calculated from the scoping review,
for urea (around -$100) and for polyols (around -$2,600) reflect that
these markets are currently profitable. The estimated utilization
potential for CO₂ in chemicals is around 0.3 to 0.6 Gt CO₂ yr⁻¹ in 2050, and
the interquartile range of breakeven costs obtained from the scoping
review is -$80 to $320 per tonne of CO₂.

Currently, the largest-scale chemical utilization pathway is that of
urea production: 140 Mt CO₂ yr⁻¹ is utilized to produce 200 Mt yr⁻¹ of
urea. Urea is produced from ammonia (which is generated by the
energy-intensive Haber-Bosch process; 3H₂ + N₂ → 2NH₃) and CO₂
according to 2NH₃ + CO₂ ⇌ CO(NH₂)₂ + H₂O; coal or natural gas
typically provides the necessary energy. Within days of being applied as
fertilizer, the carbon in urea is released to the atmosphere. For urea to


Box 1


CO₂ utilization, removal,
storage, reduced emissions and
net climate benefit


Does CO₂ utilization (CO₂u) lead to a climate benefit? It might
reduce emissions (CO₂ρ), or remove CO₂ (CO₂r) from the
atmosphere, and/or store it (CO₂s). But various direct and indirect
effects over the relevant life cycle must be considered and
compared to a plausible baseline or 'counterfactual': what would
have happened without CO₂ utilization. Assiduously calculating
direct impacts in one place, and at one time, is of little use if there
is a 'waterbed effect' (also referred to as a 'rebound' or 'leakage') in
which emissions occur somewhere else, or later.

For instance, obtaining a barrel of oil via CO₂-EOR utilizes CO₂,
which can remain in the oil formation rather than being re-emitted
into the atmosphere. Assuming that the CO₂ does not return to the
atmosphere, the CO₂ utilized is equal to the CO₂ emissions stored,
that is, CO₂u = CO₂s, but whether CO₂r = 0 depends upon the
source of the CO₂: if it is from a fossil power station, there is no
net removal of CO₂ from the atmosphere. Emissions have been
reduced, and CO₂ρ = CO₂u = CO₂s > 0, even though CO₂r = 0.

To visualize this, consider a 'reference' scenario in which 1 t
CO₂ is emitted from a fossil power plant, and 1.5 t CO₂ are emitted
from oil use, such that total emissions are 2.5 t CO₂. Compare
this to a 'utilization' scenario, in which the CO₂ from the power
plant is used for CO₂-EOR instead, that is, CO₂u = 1 t CO₂. Total
emissions in this 'utilization' scenario comprise the 1.5 t CO₂ from
the consumption of the CO₂-EOR oil. Emissions reduced is equal
to 2.5 - 1.5 = 1.0 t CO₂ρ, which is identical to the CO₂u, but net
CO₂r = 0 because the CO₂ came from a fossil power plant, rather
than from the atmosphere.
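
The bookkeeping in this example can be written out as a short Python sketch. It uses only the quantities given above; the class and variable names are illustrative, and storage is assumed to be complete (no CO₂ returns to the atmosphere), as in the text.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Emissions to the atmosphere in one scenario, in tonnes of CO2."""
    power_plant: float
    oil_use: float

    def total(self) -> float:
        return self.power_plant + self.oil_use


# Reference: the plant vents 1 t CO2 and oil use emits 1.5 t CO2.
reference = Scenario(power_plant=1.0, oil_use=1.5)
# Utilization: the plant's 1 t CO2 is injected for CO2-EOR instead of vented.
utilization = Scenario(power_plant=0.0, oil_use=1.5)

co2_utilized = 1.0                  # CO2u
co2_stored = co2_utilized           # CO2s, assuming none is re-emitted
emissions_reduced = reference.total() - utilization.total()   # CO2 rho
co2_removed = 0.0                   # CO2r: the CO2 is fossil, not atmospheric

print(f"Reference emissions:   {reference.total()} t CO2")
print(f"Utilization emissions: {utilization.total()} t CO2")
print(f"Emissions reduced:     {emissions_reduced} t CO2")
print(f"Net removal:           {co2_removed} t CO2")
```

The reduction equals the amount utilized and stored, while net removal stays at zero because the carbon source is fossil.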

In reality, the emissions from the baseline barrel of oil that
was displaced by the CO₂-EOR oil might be higher or lower,
depending on its origin and its production process. If the CO₂-EOR
oil displaces the use of renewable electricity in an electric
vehicle, CO₂-EOR generates a net increase in emissions. If CO₂-EOR
is to offer net removals, the CO₂ must be captured from the
atmosphere, and more carbon must be injected into the well than
is extracted.

Life-cycle analyses on some industrial CO₂ utilization pathways
suggest that the potential for net emission reductions is much
larger than for net removals, which appears very modest. Up
to 3 tonnes of CO₂ emissions may be avoided for every 1 tonne
of CO₂ used in polycarbonate polyols, even though no CO₂ is
removed from the atmosphere. Nearly 4 tonnes of CO₂ emissions
may be avoided for each tonne of dry wood used that displaces
concrete-based materials.

Other life-cycle analyses have found neutral or negative impacts
of CO₂ utilization on reducing emissions. For instance, CO₂
utilization pathways that involve the input of energy from
non-decarbonized sources may result in net life-cycle increases in
CO₂.


be net zero carbon, it would require its carbon to be sourced from the
atmosphere (for example, using direct air capture) and the energy
source would need to be renewable. All nitrogen-based fertilizers
produce N₂O, a greenhouse gas that is around 300 times more potent than
CO₂ over a 100-year time horizon. Increasing urea production may
therefore have a negative impact on climate.
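
The urea figures can be cross-checked against the reaction stoichiometry, in which one mole of CO₂ yields one mole of urea. The Python sketch below does this with standard atomic masses (which are not given in the text); the function name is illustrative. The stoichiometric result, about 147 Mt CO₂ for 200 Mt of urea, is close to the 140 Mt CO₂ yr⁻¹ quoted, and the gap may simply reflect rounding in the reported totals.

```python
# Standard atomic masses in g/mol (rounded; not taken from the article).
ATOMIC_MASS = {"C": 12.011, "O": 15.999, "N": 14.007, "H": 1.008}

# Molar masses of CO2 and of urea, CO(NH2)2.
M_CO2 = ATOMIC_MASS["C"] + 2 * ATOMIC_MASS["O"]
M_UREA = (ATOMIC_MASS["C"] + ATOMIC_MASS["O"]
          + 2 * ATOMIC_MASS["N"] + 4 * ATOMIC_MASS["H"])


def co2_required_mt(urea_mt: float) -> float:
    """CO2 mass (Mt) bound in `urea_mt` of urea: 2NH3 + CO2 -> CO(NH2)2 + H2O."""
    return urea_mt * M_CO2 / M_UREA


co2_mt = co2_required_mt(200.0)
print(f"CO2 bound in 200 Mt of urea: {co2_mt:.1f} Mt")
```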


For the production of polymers, the utilization potential of CO₂ is
estimated to be 10 to 50 Mt yr⁻¹ in 2050. In the current market structure,
around 60% of plastics have applications in sectors other than
packaging, including as durable materials for construction, household goods,
electronics, and in vehicles. Such products have lifespans of decades
or even centuries.


Fuels and microalgae. Fuels derived from CO₂ are argued to be an
attractive option in the decarbonization process because they can be
deployed within existing transport infrastructure. Such fuels could also
find a role in sectors that are harder to decarbonize, such as aviation,
since hydrocarbons have energy densities that are orders of magnitude
above those of present-day batteries. The long-term use of
carbon-based energy carriers in a net zero emissions economy relies upon their
production with renewable energy, and upon low-cost, scalable, clean
hydrogen production, for example via the electrolysis of water or by
novel alternative methods.

Here we consider products such as methanol, methane, dimethyl
ether, and Fischer-Tropsch fuels as potential CO₂ energy carriers for
transportation. The estimated potential for the scale of CO₂ utilization
in fuels varies widely, from 1 to 4.2 Gt CO₂ yr⁻¹, reflecting uncertainties
in potential market penetration. The high end represents a future in
which synfuels have sizeable market shares, due to cost reductions and
policy drivers. The low end, which is itself considerable, represents very
modest penetration into the methane and fuels markets, but it could
also be an overestimate if CO₂-derived products do not become
cost-competitive with alternative clean energy vectors such as hydrogen or
ammonia, or with direct sequestration.

A CO₂-to-methanol plant operates in Iceland, and various
power-to-gas plants operate worldwide. However, these plants represent special
cases that may be difficult to replicate because they are exploiting
geographic advantages, such as the availability of cheap geothermal energy.
Although the production of more complex hydrocarbons is energetically
and therefore economically expensive, rapid cost reductions could
potentially occur if renewable energy (which represents a large
proportion of total cost) continues to become cheaper, and if policy stimulates
other cost reductions. The US Department of Energy's target for the cost
of hydrogen production ($2 per kg of H₂) is roughly equivalent to $2 per
gasoline-gallon equivalent, and would require carbon-free electricity
to cost less than $0.03 kWh⁻¹ (accounting for kinetics and other losses
to the enthalpy of electrolysis-based hydrogen production, around
40 kWh per kg H₂). In recent years, several wind and solar power
auctions around the world have been won with prices below $0.03 kWh⁻¹.
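
To make the hydrogen cost target concrete, the Python sketch below computes the electricity cost per kilogram of H₂ for a given electricity price and specific energy use. The 40 kWh per kg and $0.03 kWh⁻¹ figures are those quoted above; the 50 kWh per kg case is a purely hypothetical value standing in for a real electrolyser with losses, and the function name is illustrative.

```python
TARGET_USD_PER_KG = 2.0      # hydrogen cost target quoted in the text
PRICE_USD_PER_KWH = 0.03     # electricity price threshold quoted in the text
ENTHALPY_KWH_PER_KG = 40.0   # approximate enthalpy figure quoted in the text


def electricity_cost_per_kg(price_usd_per_kwh: float,
                            kwh_per_kg: float) -> float:
    """Electricity cost (US$) of producing 1 kg of H2 by electrolysis."""
    return price_usd_per_kwh * kwh_per_kg


ideal = electricity_cost_per_kg(PRICE_USD_PER_KWH, ENTHALPY_KWH_PER_KG)
# Hypothetical consumption once kinetic and other losses are included.
with_losses = electricity_cost_per_kg(PRICE_USD_PER_KWH, 50.0)

print(f"Electricity cost at 40 kWh/kg: ${ideal:.2f} per kg H2")
print(f"Electricity cost at 50 kWh/kg: ${with_losses:.2f} per kg H2")
print(f"Left for capital and other costs (ideal case): "
      f"${TARGET_USD_PER_KG - ideal:.2f} per kg H2")
```

At the quoted threshold price, electricity alone accounts for more than half of the $2 target even before losses, which is why the target is so sensitive to electricity cost.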

The interquartile range for breakeven costs for CO₂ fuels from our
scoping review was $0 to $670 per tonne of CO₂. Negative breakeven
costs appear in studies that model particularly beneficial scenarios,
such as low discount rates, free feedstocks, or free or low-cost
renewable electricity.

For pathways that have high capital costs, the benefits of economies of
scale and learning could be considerable. This is particularly relevant
for the algal pathways that require photobioreactors and for the fuel
synthesis pathways that require electrolysers. Microalgae are a subject
of long-standing research interest because of their high CO₂-fixation
efficiencies (up to 10%, compared with 1%-4% for other biomass), as
well as their potential to produce a range of products such as biofuels,
high-value carbohydrates and proteins, and plastics. The microalgae
pathway has complex production economics and the estimated CO₂
utilization potential for microalgae in 2050 ranges from 0.2 to 0.9 Gt
CO₂ yr⁻¹, with a breakeven cost interquartile range from the scoping
review of $230 to $920 per tonne of CO₂.


Concrete building materials. CO₂ utilization pathways in concrete
building materials are estimated to remove, utilize and store between
0.1 and 1.4 Gt CO₂ yr⁻¹ over the long term (with the CO₂ sequestered
well beyond the lifespan of the infrastructure itself) at interquartile
Table 2 | Range estimates of the potential for CO₂ utilization and present-day breakeven cost

| Pathway | Removal potential in 2050 (Mt CO₂ removed per year) | Utilization potential in 2050 (Mt CO₂ utilized per year) | Breakeven cost of CO₂ utilization (2015 US$ per tonne CO₂ utilized) |
| Conventional utilization | | | |
| Chemicals | Around 10 to 30 | 300 to 600 | -$80 to $320 |
| Fuels | (0) | 1,000 to 4,200 | $0 to $670 |
| Microalgae | (0) | 200 to 900 | $230 to $920 |
| Concrete building materials | 100 to 1,400 | 100 to 1,400 | -$30 to $70 |
| Enhanced oil recovery | 100 to 1,800 | 100 to 1,800 | -$60 to -$45 |
| Non-conventional utilization | | | |
| BECCS | 500 to 5,000 | 500 to 5,000 | $60 to $160 |
| Enhanced weathering | 2,000 to 4,000 | n.d. | Less than $200* |
| Forestry techniques | 500 to 3,600 | 70 to 1,100 | -$40 to $10 |
| Land management | 2,300 to 5,300 | 900 to 1,900 | -$90 to -$20 |
| Biochar | 300 to 2,000 | 170 to 1,000 | -$70 to -$60 |

n.d., not determined.

The breakeven cost is the cost in 2015 US$ per tonne of CO₂ adjusted for revenues, by-products, and any CO₂ credits or fees. A breakeven cost of zero represents the point at which the pathway is economically viable without governmental CO₂ pricing (for example, a subsidy for CO₂ utilization). Breakeven costs presented as a range represent either (for conventional pathways with the exception of EOR) 25th and 75th percentile estimates as calculated via the scoping review of the academic literature (in which the magnitude of the difference reflects the diversity of technological and economic assumptions available within and across each sub-pathway) or (for land-based pathways) top-down estimates of revenues that may accrue (when the uncertainty of the accuracy of the estimation is high). Breakeven costs presented with an asterisk are calculated unadjusted for revenues and by-product credits. To obtain the global gross utilization potential high and low values for conventional pathways, we averaged the interpolated expert opinions with an author group estimate. For non-conventional utilization pathways, estimated utilization potential ranges are based on estimates of additional realized yield of carbon in vegetation (for soil carbon sequestration and biochar, additional yield approximates to net primary productivity, and for afforestation/reforestation, it approximates to wood products). These are first rough estimates based on preliminary but sparse published research reporting relationships between carbon storage and additional carbon that can be utilized.


breakeven costs of -$30 to $70 per tonne of CO₂. The high end might
reflect a scenario (amongst other possibilities) in which CO₂ is used as
a cement curing agent in the entirety of the precast concrete market
and in 70% of the pourable cement markets. The estimate also includes
aggregates that are produced from carbonated industrial wastes, such
as cement and demolition waste, steel slag, cement kiln dust, and coal
pulverized fuel ash.

Cement requires the use of lime (CaO), which is produced by the
calcination of limestone in an emissions-intensive process. As such, unless
calcination is paired with carbon capture and sequestration, it is difficult
for building-related pathways to deliver reductions in CO₂ emissions on
a life-cycle basis. Several commercial initiatives aim to replace the
lime-based ordinary Portland cement, which currently dominates the global
market, with alternative binders such as steel-slag-based systems or
geopolymers made from aluminosilicates.


CO₂-EOR. Enhanced oil recovery using CO₂ currently accounts for
around 5% of the total US crude oil production. Conventionally,
operators aim to maximize both the amount of oil recovered and
the amount of CO₂ recovered (rather than CO₂ stored) per tonne
of CO₂ injected; between 1.1 and 3.3 barrels (bbl) of oil can be
produced per tonne of CO₂ injected under conventional operation
and within the constraints of natural reservoir heterogeneity.
However, in principle, and depending on operating conditions and
project type, CO₂-EOR can be operated such that, on a life-cycle
basis, more CO₂ is injected than is produced upon consumption of
the final oil product.

More than 90% of the world's oil reservoirs are potentially suitable
for CO₂-EOR, which implies that as much as 140 Gt CO₂ could be used
and stored in this way. We estimate a 2050 utilization rate of around 0.1
to 1.8 Gt CO₂ yr⁻¹. If EOR were deployed to maximize CO₂ storage rather
than oil output, then genuine CO₂ emission reductions would be possible,
depending on the emissions intensity of the counterfactual and on the
relevant inefficiencies (Box 1).

At oil prices of approximately $100 bbl⁻¹, EOR is economically viable
if CO₂ can be sourced for between $45 and $60 per tonne of CO₂,
implying a breakeven cost of CO₂ of -$60 to -$45 per tonne of CO₂.
These cost estimates (realistically or unrealistically) assume $100 bbl⁻¹
oil prices and are specific to the United States, where the business model
is mature.
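
The sign convention here can be illustrated with a short Python sketch: if operators can afford to pay for CO₂, the incentive needed to make the pathway economic is the negative of that price. The sketch also converts the oil yield range quoted earlier for CO₂-EOR (1.1 to 3.3 bbl per tonne of CO₂ injected) into gross oil revenue per tonne at $100 bbl⁻¹. Function names are illustrative, and the revenue figure ignores all operating and capital costs.

```python
OIL_PRICE_USD_PER_BBL = 100.0
BBL_PER_TONNE_CO2 = (1.1, 3.3)          # conventional operation, from the text
CO2_PURCHASE_PRICE_USD = (45.0, 60.0)   # price at which EOR remains viable


def oil_revenue_per_tonne_co2(bbl_per_tonne: float, oil_price: float) -> float:
    """Gross oil revenue (US$) per tonne of CO2 injected."""
    return bbl_per_tonne * oil_price


def breakeven_cost(affordable_co2_price: float) -> float:
    """Breakeven incentive (US$ per tonne CO2): negative if CO2 can be bought."""
    return -affordable_co2_price


revenue_range = tuple(
    round(oil_revenue_per_tonne_co2(b, OIL_PRICE_USD_PER_BBL), 2)
    for b in BBL_PER_TONNE_CO2
)
breakeven_range = tuple(sorted(breakeven_cost(p) for p in CO2_PURCHASE_PRICE_USD))

print(f"Gross oil revenue per tonne CO2: ${revenue_range[0]} to ${revenue_range[1]}")
print(f"Breakeven cost per tonne CO2: ${breakeven_range[0]} to ${breakeven_range[1]}")
```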


Non-conventional utilization pathways 

The five non-conventional utilization pathways that we review here are
BECCS, enhanced weathering, forestry techniques, land management
practices, and biochar. Previous reviews have shown that these
pathways offer substantial CO₂ removal potential: a recent substantive
scoping review gives values of 0.5 to 3.6 Gt CO₂ yr⁻¹ for afforestation/
reforestation, 2.3 to 5.3 Gt CO₂ yr⁻¹ for land management, 0.3 to 2 Gt CO₂
yr⁻¹ for biochar, and 0.5 to 5 Gt CO₂ yr⁻¹ for BECCS. Enhanced weathering
offers a removal potential of 2 to 4 Gt CO₂ yr⁻¹ at costs of around $200
per tonne of CO₂. Not all of this potential involves utilization of carbon
dioxide resulting in economic value, but the approximate scale of CO₂
utilized that is described below could be considerable. The breakeven
costs per tonne of CO₂ utilized that we estimate here are low and are
frequently negative.


BECCS. BECCS involves the biological capture of atmospheric carbon
by photosynthetic processes, producing biomass used for the
generation of electricity or fuel, before CO₂ is captured and removed. Although
there is substantial uncertainty regarding the total quantity of available
biomass, particularly in light of concerns over competition for land
use with food crops, 100 to 300 EJ yr⁻¹ of primary energy equivalent
of biomass could be deployed by 2050.

BECCS provides two distinct services: bioenergy, and atmospheric
CO₂ removal. Although several cost estimates exist in the literature (for
example, around $200 per tonne of CO₂), these typically assign all
costs to the CO₂ removal service, and thus implicitly assume that no
revenue is received for the bioenergy services that are generated. By
approximating those revenues using a basket of wholesale electricity
prices across countries that are suited to host BECCS systems, we
estimate breakeven costs of between $60 and $160 per tonne of CO₂
utilized.


Table 3 | Costs of utilization compared with product costs, scoping review

| Pathway | Cost of product made with CO₂ utilization (US$ per tonne of product); median, scoping review | Selling price of product (US$ per tonne of product); present day | Difference (%) | Anticipated cost relative to incumbent in 2050 (summary, expert opinion survey and author group judgement) | Anticipated direction of cost relative to incumbent in 2050 (summary, expert opinion survey and author group judgement) |
| Polymers | 1,440 | 2,040 | -30% | Likely to be cheaper | Downward |
| Methanol | 510 | 400 | +30% | Insufficient consensus | Downward |
| Methane | 1,740 | 360 | +380% | Likely to be more expensive | Downward |
| Fischer-Tropsch fuels | 4,160 | 1,200 | +250% | Likely to be more expensive | Downward |
| Dimethyl ether | 2,740 | 660 | +320% | Insufficient consensus | Downward |
| Microalgae | 2,680 | 1,000 | +170% | Likely to be more expensive | Insufficient consensus |
| Aggregates | 21 | 18 | +20% | Insufficient consensus | Downward |
| Cement curing | 56 | 71 | -20% | Likely to be cheaper | Downward |
| CO₂-EOR | n.a. | n.a. | n.a. | Likely to be more expensive | Upward |

Median cost estimates for products made with CO₂ utilization are derived from the backward-looking scoping review. References for the selling prices are set out in more detail in Supplementary
Table 4. The costs and cost trends anticipated in 2050 are derived from a forward-looking expert opinion survey and from author group judgement.

Enhanced weathering. The use of terrestrial enhanced weathering
on croplands could increase crop yields. This yield enhancement is
unlikely to originate directly from increases in soil carbon, but from
nutrient uptake that is facilitated by pH effects. However, under our
broad definition, there may still be an as-yet-unquantified CO₂
utilization potential associated with the increase in net primary productivity.


Forestry techniques. In afforestation/reforestation, atmospheric CO₂
is removed via photosynthesis and the carbon is stored in standing
forests. If used for sustainable forestry, a portion of that carbon enters
production processes and, after minor energetic losses, becomes wood
products. Both wood products and standing forests provide economic
value, and can be seen as CO₂ utilization (standing forests provide
ecosystem services, which are not quantified here). The utilization of
CO₂ in wood products will occur in addition to the direct removal of
CO₂ by forests under certain highly specific circumstances;
sustainable harvesting can maintain carbon stocks in forests while providing
a source of renewable biomass.

We estimate that, of the volumes of CO₂ sequestered via
afforestation/reforestation in 2050, between 0.07 and 0.5 Gt of the CO₂ utilized
per year may flow into industrial roundwood products, at approximate
breakeven costs of between -$40 and $10 per tonne of CO₂ utilized. An
optimistic scenario might also consider the volumes of wood products
that are sustainably harvested from existing forests and plantations.
Yearly inflows of carbon used as wood products are estimated to be
around 1.8 Gt CO₂ in 2050. Of these, 0.6 Gt CO₂ may arise from the
portion of those flows that are industrial roundwood products sustainably
harvested for use in the construction industry (Supplementary
Materials); this leads to a top-end estimate of 1.1 Gt CO₂ utilized per year from
afforestation/reforestation and sustainable forestry techniques.

Wood products have potential as long-term stores of carbon,
particularly when used in long-lived buildings, the lifespans of which
can be conservatively estimated at 80-100 years. We estimate that
around half of the carbon in the wood-product pool might continue to
be stored beyond the usable life of the products (the non-decomposed
fraction of the portion of total wood products that are presently
committed to landfill (around 60%) is approximately 77%). The remainder
of the carbon in the wood-product pool will return to the atmosphere
as a fraction (about 0.5 Gt CO₂ yr⁻¹) of the 5 Gt CO₂ yr⁻¹ land-use change
flux that is depicted in Fig. 1.
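
The 'around half' figure can be checked with a one-line product of the two fractions given in parentheses above. The Python sketch below is illustrative only; it assumes that the landfilled, non-decomposed fraction is the sole route to storage beyond product life, which is an interpretation of the parenthetical and not a method stated in the source.

```python
LANDFILL_SHARE = 0.60          # share of wood products committed to landfill
NON_DECOMPOSED_SHARE = 0.77    # share of landfilled wood that does not decompose

# Fraction of wood-product carbon stored beyond the usable life of the products.
stored_fraction = LANDFILL_SHARE * NON_DECOMPOSED_SHARE
returned_fraction = 1.0 - stored_fraction

print(f"Stored beyond product life: {stored_fraction:.0%}")
print(f"Returned to the atmosphere: {returned_fraction:.0%}")
```

The product is a little under one half, in line with the estimate quoted in the text.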


Soil carbon sequestration and biochar. CO₂ in land management
and biochar pathways can be considered to be utilized if it enhances
economically valuable agricultural output. The CO₂ taken up by land
ultimately becomes either CO₂ utilized (with increased output) or CO₂
removed (stored in soils), but not both. We estimate that around 0.9 to
1.9 Gt CO₂ yr⁻¹ may be used by soil carbon sequestration techniques on
croplands and grazing lands by 2050; approximate breakeven costs are
estimated at between -$90 and -$20 per tonne of CO₂ utilized, owing to
yield increases that are associated with increases in soil organic carbon
stock. We tentatively estimate that approximately 0.2 to 1 Gt CO₂ yr⁻¹ may
be utilized via yield increases after the application of biochar on managed
lands, at approximate breakeven costs of between -$70 and -$60 per
tonne of CO₂ utilized. These estimates are based on currently reported
yield increases (of 0.9% to 2% associated with soil carbon sequestration
techniques and 10% associated with biochar) from sparse literature,
using crop production as a proxy for net primary productivity. Impacts on
yield are likely to be highly variable, for example according to climatic
zone. Crop productivity increases are important not only for economic
returns for operators but also for land-use requirements. For instance,
if the application of biochar led to an increase in tropical biomass yields
of 25%, the associated reduction in land requirements would equate to
185 million hectares, and would result in a cumulative net emission
benefit from those increased yields of 180 Gt CO₂ to 2100.

Table 2 presents breakeven cost ranges and estimated volumes of
CO₂ utilized or removed per year in 2050.


Techno-economic barriers to scaling 


There are numerous challenges in scaling CO₂ utilization. Here we
consider issues related to cost, technology and energy. Although market
penetration can be facilitated by cost-competitiveness, there is no
certainty that the cheapest CO₂ utilization pathways will scale up.
Geographical, financing, political and societal considerations are briefly
addressed in the Supplementary Materials; however, further
investigation of these issues is warranted, particularly with regard to the UN
Sustainable Development Goals.


Cost and performance differentials 

The breakeven cost per tonne of CO₂ is one way to assess the economics
of utilization. The impact of CO₂ utilization on the price and value-add
proposition of the end product is also important, particularly for CO₂
utilization processes in which the final price differential is immaterial
but small differences in key properties may be important. Prices for a
fuel product made using CO₂ currently exceed market prices
considerably (Table 3).
Fig. 3 | Estimated CO₂ utilization potential and breakeven cost of different
sub-pathways in low and high scenarios. The breakeven cost is the incentive,
measured in 2015 US$ per tonne of CO₂, that is required to make the pathway
economic. Negative breakeven costs indicate that the pathway is already
profitable, without any incentive to utilize CO₂ (such as a tax on CO₂ emissions
in cases in which utilization avoids emissions, or a subsidy for CO₂ removed
from the atmosphere in the case in which utilization removes CO₂). Utilization


Many of the other pathways, in particular those involving products
in construction and plastics, have economics that are driven not only
by price but also by the performance characteristics of the end product.
There may be trade-offs between product quality and mitigation value,
or synergies between the two.

Because they are based on a backward-looking scoping review, our
cost estimates for conventional pathways do not capture current
unpublished innovations and advances in the industrial arena. Our expert
opinion survey, which included sources from both academia and industry,
reflected great uncertainty about future costs. Industry participants
expressed confidence that costs in pathways that are already economic
(such as CO₂ cement curing and polyols) would continue to decrease,
relative to incumbent product costs.


Energy requirements 

Some CO, utilization pathways involve chemical transformations that 
require the input of substantial amounts of energy (Supplementary 
Fig. 2). Some require energy to increase CO, concentrations from 
0.04% towards 100%. Life-cycle emissions and costs depend upon 
the source of the energy used. Land-based natural processes use solar 
energy, harnessed by photosynthesis, to transform CO, and water 
into carbohydrates. Although photosynthesis is an inefficient process 
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estimates are based on 2050 projections. Many technologies are inthe very 
early stages of development, and cost optimization via research and 
development could substantially change these estimates. Colour shadings 
reflect the TRLs of the pathways, which again vary markedly within each 
pathway. Asterisks denote the storage duration offered by each pathway: days 
or months (*) decades (**) or centuries or more (***). See Supplementary 
Materials for further details. 


(the average efficiency is around 0.2% globally”) biological pathways 
are not necessarily more expensive. In industrial processes, hydrogen 
often serves as feedstock. At present, ‘brown’ hydrogen is primarily— 
and most cheaply—generated by reforming methane®, which has 
associated CO, emissions. In the production of ‘blue’ hydrogen, these 
emissions are captured and stored. Production of ‘green’ hydrogen—by 
the electrolysis of water— has real potential, and the ultimate choice 
of technology for the generation of hydrogen will depend onthe rates 
of cost reduction”, among other factors. 
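
As a rough illustration of why concentrating CO₂ from 0.04% towards 100% requires energy, the sketch below evaluates the ideal-gas thermodynamic minimum work of separation, −RT ln x per mole of CO₂ extracted from a large reservoir at mole fraction x. This lower bound is a textbook idealization introduced here for illustration and is not a figure from the source; practical capture processes require considerably more energy. The temperature and the function name are assumptions.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1
M_CO2 = 44.01    # molar mass of CO2, g mol^-1


def min_separation_work(mole_fraction, temperature_k=298.15):
    """Ideal minimum work (J per mol of CO2) to extract pure CO2 from a
    large ideal-gas reservoir at the given CO2 mole fraction."""
    if not 0.0 < mole_fraction <= 1.0:
        raise ValueError("mole_fraction must be in (0, 1]")
    return -R * temperature_k * math.log(mole_fraction)


air = min_separation_work(0.0004)         # atmospheric CO2, 0.04%
per_tonne_gj = air * (1e6 / M_CO2) / 1e9  # J per mol -> GJ per tonne of CO2
print(round(air / 1000, 1), "kJ per mol of CO2")
print(round(per_tonne_gj, 2), "GJ per tonne of CO2")
```

Because the bound grows only logarithmically as the source becomes more dilute, the source of the energy, rather than the ideal minimum itself, tends to dominate life-cycle emissions and costs, as the text notes.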


The outlook for CO₂ utilization

Our high-end and low-end scale and cost estimates in Table 2 are
drawn as cost curves in low and high scenarios in Fig. 3. These curves
are constructed using currently available (and often sparse) data in
the peer-reviewed literature or, where data are not available, using
approximations, and should be considered as a speculative first pass
at envisioning future scenarios. The curves should not be interpreted
as comprehensive assessments of costs, they do not represent
nth-of-a-kind costs, and they are incompatible with other sequestration or
abatement cost curves. The limitations of cost curves, particularly
with regard to exogenous costs such as establishment costs, have
been previously described, and they remain relevant here. An important
caveat is that individual potentials cannot be arbitrarily summed:
some access the same demand, for instance for transport, which may
or may not be filled by a process that utilizes CO₂. For instance, the
putative success of CO₂-fuels may reduce the demand for oil, thus
also reducing the potential of CO₂-EOR. Furthermore, land availability
means that choosing one land-based pathway (for example, BECCS)
might preclude the application of another at scale (for example,
biochar).
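
The construction of a cost curve, and the caveat about summation, can be sketched as follows. Pathways are sorted by breakeven cost and their volumes are accumulated. The pathway names, costs and volumes are placeholders and are not values from Table 2. The naive cumulative sum deliberately ignores overlaps between pathways, so in line with the caveat above it should be read as an upper bound rather than an achievable total.

```python
from typing import List, NamedTuple, Tuple


class Pathway(NamedTuple):
    name: str
    breakeven_cost: float  # US$ per tonne of CO2
    volume: float          # Gt CO2 per year


def cost_curve(pathways: List[Pathway]) -> List[Tuple[str, float, float]]:
    """Sort by breakeven cost and accumulate volume (ignores overlaps)."""
    curve = []
    cumulative = 0.0
    for p in sorted(pathways, key=lambda p: p.breakeven_cost):
        cumulative += p.volume
        curve.append((p.name, p.breakeven_cost, cumulative))
    return curve


def volume_below(pathways: List[Pathway], max_cost: float) -> float:
    """Naive total volume available at or below max_cost (an upper bound)."""
    return sum(p.volume for p in pathways if p.breakeven_cost <= max_cost)


# Placeholder data for illustration only.
example = [
    Pathway("pathway A", 50.0, 0.5),
    Pathway("pathway B", -20.0, 0.25),
    Pathway("pathway C", 400.0, 1.0),
]

for name, cost, cumulative in cost_curve(example):
    print(f"{name}: {cost:+.0f} US$/t, cumulative {cumulative:.2f} Gt/yr")
print(volume_below(example, 100.0))
```

A more careful treatment would cap the combined volume of pathways that compete for the same demand or the same land before accumulating them.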

Notwithstanding the many caveats, the potential scale of utilization
could be considerable. Much of this potential CO₂ utilization, notably
in ‘closed’ and ‘open’ pathways, may be economically viable
without substantial shifts in prices. The specific assumptions of the
low scenario, which do not account for potential overlaps in utilization
volumes between pathways, imply an upper bound of over 1.5 Gt
CO₂ yr⁻¹ at well under $100 per tonne of CO₂ utilized. For policymakers
who are interested in climate change, these figures demonstrate the
theoretical potential for correctly designed policies to incentivize the
displacement of fossil fuels or the removal of CO₂ from the atmosphere.

Figure 3 also highlights some of the economic and technological challenges
that are faced by these pathways. The cycling pathways (other
than the production of urea and polyols) must compete with lower-cost
incumbents. The four closed pathways, except for CO₂-EOR, are mainly
at low technology readiness levels (TRLs). Open pathways, although
both theoretically profitable and implementable, often incur additional
operating costs (such as implementation, transaction, institutional
and monitoring costs), which can be high.

Each of the potentially large-scale, low-cost pathways also faces challenges
as a mitigation strategy. CO₂-EOR utilizes and, with correct policy,
stores CO₂ at scale, but may not yield any net climate benefit and
may even be detrimental. BECCS has a range of well-articulated risks,
including considerable increases in emissions as a result of land-use
change. Land management, biochar and forestry offer only shorter-term
storage, face saturation, and risk large-scale flows of CO₂ back to
the atmosphere. The chemicals pathways may reduce net emissions by
displacing fossil fuel use, but will not contribute to net removal unless
they are paired with direct air capture in a net zero world. Building
materials face a challenging route to market penetration owing to regulatory
barriers, which may take decades to surmount. In general, low TRLs
will also challenge the ability of pathways to scale rapidly enough and
within the desired timeframe for mitigation. The uncertainty in future
outcomes is relatively large, and very few industries globally involve
over 1 Gt yr⁻¹ of material flows.

The net climate impact of the CO₂ utilization pathways will, in many
cases, depend upon the emissions intensity from the prevailing
processes. For instance, CO₂-EOR might currently contribute to an overall
reduction in atmospheric CO₂, compared to business-as-usual. As
decarbonization proceeds, however, the climate benefit of CO₂-EOR is
reduced. At some point before full decarbonization, EOR without direct
air capture will result in a net increase in CO₂ emissions. Conversely,
in an economy with high supply-chain emissions, the climate benefit
from BECCS is low. In a decarbonized world, those supply-chain emissions
will be close to zero and so the climate benefit from BECCS will
be amplified.

Each of the utilization pathways described here should be seen as a
part of the cascade of mitigation options that are available. For instance,
using recycled organic matter to reduce fertilizer use and its associated
emissions is a priority, followed by the more efficient use of
fertilizer, followed by increasing urea yields to reduce total emissions
(via more efficient use of NH₃). Eventually, fertilizers derived from
fossil-fuel-free ammonia should be used to supplement fertilizers derived
from organic materials. Similarly, a robust finding in the literature on
integrated-assessment modelling is that the electricity sector should be
decarbonized first, which then facilitates decarbonization in other, more
difficult sectors. In terms of the climate impact per kWh of electricity
use, available renewable electricity is more efficiently directed towards
e-mobility and heat pumps rather than towards hydrogen-based CCU
technologies in the chemical industry.


Future priorities for CO₂ utilization


Given the slow nature of the innovation process and the urgency of the 
climate problem, priority should be given to the most promising and 
least-developed options so that early and effective adoption of a port- 
folio of techniques can be achieved. For the pathways with apparently 
negative cost (that is, those that should be profitable in the absence 
of a theoretical CO₂ subsidy), the challenge, particularly for the open
pathways, is to identify and overcome the other barriers to adoption.

An important caveat for policymakers and practitioners is that scaling
up CO₂ utilization will not necessarily be beneficial for climate stability;
policy should not aim to support utilization per se, but should instead
seek to incentivize genuine emission reductions and removals on a
life-cycle basis, and thus provide incentives for the deployment of CO₂
utilization that is climate-beneficial.


Conventional utilization pathways 

The emissions-reduction potentials of the three cycling pathways would
be facilitated by declines in the costs of CO₂ capture. New sorbents could
reduce the cost of energy-intensive separation of CO₂ from flue gases
and industrial streams. In the longer term, cheaper direct air capture
(based on clean energy) would support the scale-up of these pathways.
The cost of DACCS has recently been assessed to be between $600 and
$1,000 per tonne of CO₂ for the first-of-a-kind plant, with nth-of-a-kind
costs potentially of the order of $200 per tonne of CO₂.

Research into materials and catalysts for CO₂ reduction could enable
the efficient transformation of CO₂ into a broader range of products
at a lower cost. This includes the development of catalysts for the
efficient production of syngas via dry reforming of methane with CO₂;
efficient photo/electrocatalysts to release hydrogen from water;
photo/electrocatalysts that can reduce CO₂; or new high-temperature,
reversibly reducible metal oxides to produce syngas using concentrated
sunlight. New membrane materials that can separate miscible liquids
(for example, methanol and water) will also be important. Catalytic
processes can be optimized to increase CO₂ emission reductions or
to reduce energy consumption. One important research challenge
is to produce materials with the highest material property profiles,
in particular temperature stability and wider operating or processing
temperature windows. Rigorous, realistic techno-economic analyses of
these scientific advances could determine their contribution to valuable
cost reductions.

Given the rapid rate at which human societies are urbanizing, there
is an urgent one-time opportunity to deploy new building materials
(including wood, as discussed below) that utilize and store CO₂ and
displace emissions-intensive Portland cement. In this area, as in others,
progress would be aided by techno-economic analyses and life-cycle
analyses with clearer system boundaries, counterfactuals, and accounting
for co-products, and integrated modelling frameworks that can
co-assess changes in background systems.


Non-conventional utilization pathways 

Figures 1 and 3 suggest that land-based biological processes offer a large
opportunity to utilize, remove and store more CO₂. Progress here is
partly dependent upon field-based trials to improve understanding of
the system-wide impacts of different pathways on plant yields and the
impacts on water, food and water systems, and other resources. Such
research might prioritize multiple-land-use approaches, such as
agroforestry plantations; rice straw as biomass; low-displacement bioenergy
strategies such as crassulacean acid metabolism plants on marginal land; or
nipa palm in mangroves. A better understanding of soil carbon dynamics
and improved phenotypic and genotypic plant selection will also help.


Biochar is currently at a low TRL and has associated uncertainties.
However, if these can be overcome, its position low on the cost curve
in both low and high scenarios suggests that this pathway may have
considerable potential. A major challenge is to reduce the variability in
yield effects, which is likely to hinder the economic decision made
by farmers to apply biochar, and to find ways to secure potential
revenue streams.

Increased forestation, where land availability and biodiversity constraints
allow, and the greater use of wood products in buildings are
strategies that appear to be worth pursuing. Although our estimates consider
the scale-up of existing industrial roundwood use via afforestation
and reforestation, new wood-based products such as cross-laminated
timber and acetylated wood, which are aimed at new markets, also
have potential. Specification, quality and safety measures for these
products are approaching comparability to many concrete structures,
and current manufacturing scale-up suggests that this may be a market
with strong growth prospects.


Cross-cutting efforts 

Broad policy and regulatory changes that may support the appropriate
scale-up of CO₂ utilization include creating carbon prices of around
$40 to $80 per tonne of CO₂ (increasing over time) to penalize CO₂
emissions and to incentivize verifiable CO₂ emissions reductions and
removals from the atmosphere. We do not advocate a direct subsidy
for utilization. Instead, incentives for CO₂ removals and reductions
(or penalties for emissions) are justified, and these will support CO₂
utilization in cases in which it is beneficial for the climate. For instance,
our analysis suggests that closed pathways with scalability (such
as BECCS and building materials) would be sensitive to a subsidy
for CO₂ removals. Changes to standards, mandates, procurement
policies and research and development support, in order to close
gaps in knowledge across a portfolio of pathways, are also desirable.
Financing and managing the emergence of a globally important
new set of CO₂ utilization industries will probably require clear
direction and industrial support from government. An enabling ‘net
zero’ legislative regime (such as that in place in Sweden and the UK
and proposed in New Zealand) can provide clarity about the necessary
scale of industries that reduce and remove CO₂, including the
pathways examined here.

Collaboration between scholars, public officials and business leaders
to ensure accurate comparisons between different alternatives
(including the direct comparison of CCU, CDR and CCS pathways) could
facilitate the blending of advantageous features of the ten pathways
described here, the exploration of pathways not addressed here, and the
identification of novel CO₂ utilization pathways to accelerate emissions
reductions and removals.

CO₂ utilization is not an end in itself, and these pathways, individually or
even collectively, will not provide a key solution to climate change.
Nevertheless, there is a substantial societal value in continued efforts to
determine what will and will not work, in what contexts the climate will or
will not benefit from CO₂ utilization, and how expensive it will be.
Much of the Earth’s biosphere has been appropriated for the production of
harvestable biomass in the form of food, fuel and fibre. Here we show that the
simplification and intensification of these systems and their growing connection to
international markets has yielded a global production ecosystem that is homogenous,
highly connected and characterized by weakened internal feedbacks. We argue that
these features converge to yield high and predictable supplies of biomass in the short
term, but create conditions for novel and pervasive risks to emerge and interact in
the longer term. Steering the global production ecosystem towards a sustainable
trajectory will require the redirection of finance, increased transparency and
traceability in supply chains, and the participation of a multitude of players, including
integrated ‘keystone actors’ such as multinational corporations.
The demand for harvestable biomass (food, fuel and fibre) by a growing,
wealthier and increasingly urbanized global human population is
placing relentless pressure on the Earth’s ecosystems. To a large extent,
this demand has been met by converting ecosystems into production
ecosystems, that is, ecosystems modified for the production of one or a few
harvestable species. Although these alterations occur at local scales,
their cumulative effect is causing global transformation of the Earth’s
biosphere. Humans have already altered more than 75% of the world’s
terrestrial habitats: nearly 40% of all productive land has been converted
into agricultural areas and two thirds of all boreal forests are
under some form of management, mainly for wood production. In the
seas, around 90% of large industrial fisheries are either overexploited or
fully exploited, and a rapidly expanding aquaculture sector is occupying
increasing areas of coastal and offshore space.

As available productive land and abundant fish stocks become
progressively scarce, the potential for further land conversion, land
redistribution and exploitation of new wild stocks as options to meet
projected global human demand is dwindling. To increase efficiency,
production ecosystems are intensified and simplified using human
inputs such as fossil fuels, fertilizers, pesticides, antibiotics and
technology. In parallel, people, places, cultures and economies are
increasingly linked across geographic locations and socioeconomic
contexts, making production ecosystems increasingly globally
interconnected. Collectively, these changes are converting much of
the biosphere into a global production ecosystem (GPE).

This new reality calls for approaches that recognize the biosphere system
as a complex and integrated social-ecological system. Within
this context, resilience (the capacity of a system to persist with and
adapt to change, but also transform away from unsustainable
social-ecological trajectories) has been suggested as a conceptual framework
that could assist in developing paths towards sustainability. Whereas
the aggregated transformation of Earth’s biomes is indisputable, its
consequences for the dynamics and resilience of an expanding GPE
remain poorly understood.

Here we describe the anatomy of the GPE through the lens of three
key features underpinning resilience, namely connectivity, diversity
and feedback. We do this by considering a diverse set of socioeconomic
and biophysical elements that have previously been studied
separately. We discuss how this anatomy influences the resilience of
the GPE and creates novel conditions for risks to emerge and interact.
We conclude by highlighting three avenues that can foster innovation
and encourage new partnerships to motivate transformation towards
a more sustainable GPE.


The anatomy of the GPE 


The GPE is the result of three important and interacting trends: (1) the
continued conversion of the Earth’s biosphere into simplified production
ecosystems, (2) the increased intensification and dependence of
these production ecosystems on human inputs, and (3) their expanding
connectivity through global markets. The GPE integrates multiple
sectors, broadly referred to here as forestry, agriculture (crops and
livestock) and fishery (wild capture and aquaculture) (Fig. 1). We recognize
that some production ecosystems, such as subsistence fishing
and farming or diversified agricultural landscapes, may be subject to
little human input or export-mediated connectivity from international
trade. Nevertheless, they will be increasingly shaped by a broader set of
global drivers, such as policies, technologies and economic changes.


Fig. 1 | The global production ecosystem. The GPE is characterized by tightly
coupled relationships and reciprocal influence within and between
harvestable biomass (green inner circle), multiple sectors (blue middle circle)
and a broad set of distal drivers (grey outer circle). To the right are the three
lenses (connectivity, diversity and feedback) and their key features through
which the anatomy of the GPE is described in this paper.


Connectivity: the breakdown of isolation

A distinctive feature of the present day is the way in which human activities
increase connectivity. Although the drivers of this connectivity are
not new (for example, trade, transport, technology and consumption),
the speed and scale at which it occurs are unprecedented.

Connectivity within the GPE is underpinned by long-distance
biophysical and socioeconomic teleconnections. For example,
irrigation and deforestation for agriculture in one location can redistribute
global evapotranspiration, thereby changing rainfall patterns
and affecting terrestrial production ecosystems in other regions.
Increased CO₂ emissions associated with deforestation also affect
aquaculture and wild-capture fisheries through increased seawater
temperatures and ocean acidification. Thus, land transformation
in one part of the world can have substantial effects on production
ecosystems at distant locations, within and across sectors.

At the same time, trade that was once constrained by limitations
in transport capacities and lack of trade agreements is increasingly
contributing to matching global supply and demand. International
trade has undergone huge expansion in the past few decades, and
now accounts for 24% of all agricultural land, 23% of the freshwater
resources used for food production and more than 35% of global seafood
production. The number of regional trade agreements in force has
more than tripled since 2000, and nearly all cropland areas brought
into production from 1986 to 2009 were used to grow export crops.
As a consequence, production ecosystems have been further simplified
and intensified to produce products destined for global markets.

The growth of international trade has also increased direct and indirect
connections between different production ecosystem sectors.
For example, agricultural exports such as soybean and palm oil produced
for the European Union, US and Chinese markets are a primary
driver of deforestation across the tropics. Sectors have also become
intertwined through different output-as-input relationships. The increase
in feed trade to satisfy global livestock production is occurring at an
unprecedented rate, and as the effects of intensification unfold, new
connections are emerging. For instance, the aquaculture sector, which
has traditionally relied heavily on capture fisheries as the main source
for feed, is shifting towards agriculture for crop-based feed (for example,
soy, rapeseed and maize) in response to declining fish catches.


The interconnections between sectors are further amplified by the
emergence of large transnational corporations that link production
ecosystems globally through their subsidiaries. These vertically and
horizontally integrated ‘keystone actors’ rely on connectivity for their
own growth and represent a critical feature of the GPE by operating
across sectors, markets and geographies to source, store, trade, process
and distribute biomass. Such integration allows a few actors to
dominate all segments of production, control the whole supply chain
and have a disproportionate influence on decision-making. Consolidation
of large industrial actors has been recorded across many sectors,
including forestry, seafood, livestock and agri-food industries.
There are concerns that such consolidation reinforces global
homogenization of species (including genes, varieties and crops), practice
and knowledge.


Diversity: more becomes less 

The purposeful selection of particular harvestable products and the
collateral effects of these choices are driving biotic homogenization
in both terrestrial and aquatic ecosystems. In many areas, boreal
forests have been simplified as a consequence of intensive silviculture
for timber production, tropical forests have been replaced by spatially
extensive monocultures (for example, soy and oil palm plantations),
and native Mediterranean ecosystems have been simplified by exotic
pine tree plantations. In grasslands, moderate intensification has
resulted in collateral biotic homogenization across microbial, plant
and animal groups, both above and below ground. In the Amazon,
rainforest bacterial communities have become homogenized as a result
of land conversion to cattle pasture, and in marine systems, rising
seawater temperatures have led to the rapid homogenization of fish
assemblages.

Homogenization is also evident from a food production perspective.
More than 80% of the global fish and shellfish aquaculture production
is sourced from 30 species, of which grass carp, silver carp, cupped
oysters, common carp and manila clam account for more than 30%
by volume. The pattern is even more striking for the global livestock
sector, in which the production of pigs and chicken amounts to 40%
and 34%, respectively, of global meat production. In agriculture,
national portfolios of food supplies have seen increased crop species
diversity, whereas globally they have become more homogeneous in
composition, illustrating a shift towards a globally standardized food
supply based on a few crop types such as maize, wheat, rice and barley.
Homogenization of crop production is further promoted by the
recent rise of ‘flex crops and commodities’. These are commodities
that are suited for multiple uses that can be flexibly interchanged (for
example, soy as food for humans, feed for animals, or biofuel; or trees
for timber, pulp, ethanol, or carbon sequestration purposes). Such
commodities provide flexibility for producers and investors to allocate
products depending on which market has the highest demand, for
instance, in the face of changes in policy regulations, market prices
or technological advances.

Box 1
Financialization of the biosphere


After decades of financial deregulation and innovation, the
intensification of investments in natural assets has led to concerns
about the role of an expanding global financial sector in shaping
production ecosystems. Financialization, defined as the
increasing importance of financial markets, motives, institutions
and elites in the operation of the economy and its governing
institutions, has been suggested as a rapidly emerging and
powerful decoupling mechanism that abstracts biomass from
its physical form. For example, new financial practices and
instruments, such as securitization and complex commodity
derivatives, have led to a situation where only 2% of commodity
futures contracts end with delivery of the physical good. In
environmental conservation, financialization is seen as a new
frontier for capital investment, in which the responsibility for
global environmental outcome is increasingly shifted towards
the incentivizing control of investment finance. The green bond
market, which is designed to simultaneously achieve financial
returns and environmental benefits, has witnessed huge growth
over the past five years and is predicted to reach US$250 billion
in 2019 (Box 1 Figure). Attention is also shifting towards the role
of finance within the blue economy narrative. In a move analogous
to green bonds, the Seychelles government has announced the world’s
first blue bond, valued at US$20 million, to fund sustainable fisheries.
However, studies have warned that the growing interest among
financial institutions in investments in the seafood sector may
lead to adverse effects on small-scale fisheries through increased
privatization and ocean grabbing. The use of fishing quotas
as collateral in loans by Icelandic banks, and the resulting debt
for the industry when the banking system collapsed, provides a
compelling example. If financial actors were to become aware
of how ecological risks translate into financial risks, entry points
for sustainability considerations into financial decisions might
emerge with strong incentives to implement better standards and
redirect capital towards more sustainable practices. Regardless,
financialization of the biosphere should be increasingly recognized
and studied as an intrinsic process shaping the GPE.

Box 1 Figure | Rise of the global green bond market. Data are
from the Climate Bonds Initiative (https://www.climatebonds.net).
The 2019 data are forecast (indicated by an asterisk). Note that
while the growth of the green bond market is unquestionable,
it represents only a tiny fraction of the overall bond market,
estimated to exceed US$100 trillion.


Feedback: decoupling in a hyperconnected world 

Paradoxically, increased connectivity within and among production
ecosystems is weakening important feedback relationships within
the GPE. First, there is broad evidence that intensification decouples
production ecosystems from the natural processes needed to sustain
desired production outcomes (that is, regulating and supporting
ecosystem services). Instead, human inputs are increasingly used
to mimic natural processes and responses in the system. Examples
include substituting the natural breakdown and uptake of nutrients
(that is, nutrient recycling) with fertilizers to enhance crop productivity,
relying on artificial feed inputs to increase aquaculture yield, and
replacing natural pest control with pesticides and herbicides to avoid
yield losses. This can potentially undermine the capacity of production
ecosystems to sustain desired biomass in the long run. For instance,
agricultural intensification has been reported to cause soil erosion,
declines in fertility, loss of natural pollinators, downstream damage
to water resources and degradation of coastal ecosystems.

Decoupling also emerges as the geographical distance between the
location of biomass production and where it is consumed increases.
Approximately one quarter of all food produced for human consumption
is traded internationally, and almost one billion people
are consuming internationally traded products to cover their daily
nutrition. Estimates further suggest that 20% of global cropland is
being allocated to the production of commodities that are consumed
in another country. This spatial decoupling, or ‘distancing’, allows
industries to substitute supplies from different species or production
ecosystems so that global consumers remain relatively unaffected
by, and unaware of, changes occurring at individual source areas.
Declining fish stocks, for example, are compensated for by substituting
source areas, shifting to new but similar species or replacing wild
catch with supply from aquaculture. Similarly, international trade
enables countries to displace their land use (for example, deforestation)
to other nations. As long as consistent demand exists through
globally distributed markets, implementation of policies to mitigate
overexploitation in one place (such as protected areas or reduced
quotas) may simply increase pressure elsewhere (leakage effects),
with a global net decline as a result.

The current global model of biomass production also spatially decouples
consumption from the environmental impacts that it entails.
These impacts extend beyond direct collateral damages, such as spread
of infectious diseases, pollution, habitat degradation and loss of
biodiversity. They include reallocation of natural resources (for example,
land and water) needed to produce traded commodities destined
for direct human consumption or as input to produce biomass with
higher protein and nutrient content. The trade of these embodied
resources (virtual trade) has been estimated to incorporate 24% and
22% of the global land and water footprint, respectively, and account
for 11% of global groundwater depletion.

More recently, attention has been given to the way in which decoupling
may arise from the growing influence of finance and the emergence
of novel financial instruments (Box 1). New types of agricultural
insurance have been developed whereby payouts are no longer based
on direct measured loss of crops, but are instead triggered by an index,
such as a predefined threshold in rainfall. Although these index insurance
policies present benefits for both insurers (by resolving the problem
of moral hazard and reducing the transaction costs of verifying


Box 2 


A network of networks 


The GPE is a worldwide social-ecological network of networks. It 
is composed of a large number of interacting networks that span 
sectoral, jurisdictional and geographical boundaries, connect 
various actors and institutions, and link human societies to the 
biosphere. Individual nodes in this global network can represent 
countries, actors, institutions, sectors, species or ecosystems. Links 
can capture collaboration, trade, policy overlap, environmental 
effects, species dispersals or trophic interactions. A network-modelling
approach can help to uncover invaluable clues to the
resilience of the GPE. For example, the degrees and patterns of
connectivity across the many networks can have important
shock-amplifying or shock-dampening implications.
Another promising but nascent area of research is the investigation
of how well social and ecological systems are aligned within a
social-ecological network and empirical assessment of how this
social-ecological fit affects outcomes in the system. For example,
early results suggest that high levels of social-ecological fit
provide an important foundation for more sustainable and adaptive
practices to emerge (Box 2 Figure). Future research should aim
at exploring the nature and extent of social-ecological fit in the
GPE, but also whether enhancing social-ecological fit could lead
to mechanisms that unravel and expose different masking effects,
such as land displacement, sequential overexploitation and virtual
trade.
Box 2 Figure | Enhancing social-ecological fit in fisheries.

a-c, The blue nodes represent interconnected fish stocks (for 
example, through dispersal and seasonal migration) being 
targeted at different locations by different fishing fleets (red 
nodes). For highly mobile fish species such as tuna, the distance 
between any two localized stocks can be very large and cross 
several jurisdictional borders. a, If actors (fishing fleets) are 
operating independently of each other, the ecological and 
social connectivities are not aligned. This makes it more difficult 
to respond collectively to changes and disturbances, and can 
incentivize actors to (knowingly or unknowingly) overharvest 
fish stocks. b, Overfishing in one location will negatively affect 
the status of the fish stock in the other location, and vice versa. 


losses) and farmers (by improving access to credit and mitigating
climate risk), they are often coupled to the adoption of commercial
inputs and specific crops that reinforce the simplification of agricultural
landscapes and the homogenization of practices. This increases
smallholders’ exposure to risks and erodes their ability to adapt to
extreme environmental variability. Because the actual agricultural
performance is no longer relevant for the indemnity payment, farmers
are also at risk of experiencing losses but not receiving a payment if the
index threshold is not met.
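
The gap between an index-triggered payout and a farmer's actual loss can be made concrete with a minimal sketch. It assumes a fixed payout that is released when seasonal rainfall falls below a trigger; the names `Season`, `index_payout` and `uncovered_loss`, and all numbers, are hypothetical and are not taken from any real insurance product.

```python
from dataclasses import dataclass


@dataclass
class Season:
    rainfall_mm: float  # index value, e.g. measured at a reference weather station
    crop_loss: float    # loss actually experienced by the farmer (currency units)


def index_payout(rainfall_mm: float, trigger_mm: float, payout: float) -> float:
    """Pay a fixed amount if rainfall is below the trigger; actual loss is ignored."""
    return payout if rainfall_mm < trigger_mm else 0.0


def uncovered_loss(season: Season, trigger_mm: float, payout: float) -> float:
    """Part of the farmer's loss that the index payout does not cover."""
    paid = index_payout(season.rainfall_mm, trigger_mm, payout)
    return max(0.0, season.crop_loss - paid)


# Hypothetical seasons: the second has a real loss (say, from a pest outbreak)
# although rainfall stayed above the trigger, so no payout is released.
dry_year = Season(rainfall_mm=250.0, crop_loss=80.0)
pest_year = Season(rainfall_mm=320.0, crop_loss=80.0)

for season in (dry_year, pest_year):
    print(index_payout(season.rainfall_mm, 300.0, 100.0),
          uncovered_loss(season, 300.0, 100.0))
```

In the second hypothetical season the index is not triggered, so the farmer bears the full loss, which is the risk described in the paragraph above.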

Collectively, these different decoupling mechanisms have ramifications
for how production ecosystems and the benefits they produce
are perceived, valued and managed.


Resilience in the GPE 

Resilience is a concept that is widely used in science, management 
and policy. The concept has multiple meanings, which can have con- 
sequences for evaluating, understanding, and managing systems, 


[Box 2 Figure panel labels: Overharvested fish stocks; Catch 1; Catch 2; Reducing incentives for overfishing]


c, Bringing the actors closer together (for example, through
exchanging information or agreeing on common regulations) would
increase the possibilities for tightening social-ecological feedback
loops (recoupling), thus reducing incentives for overfishing and
increasing the likelihood of the emergence of collective action.
Such increased social-ecological fit could be established either
by connecting the different actors directly, or indirectly through
a third-party mediator. Examples of direct connections include
the emergence of fishing collectives or cooperation among
global seafood companies. An indirect connection could occur
through a regional fishery management organization, in which the
participating member states adhere to commonly devised fishery
regulations for stocks over which no single state has full authority.


depending on which definition is used. Resilience can refer to the time
it takes for a system to return to its original state after perturbation
(recovery) or, as in this Perspective, it can describe the extent to which a
system can develop with change by absorbing recurrent perturbations,
deal with uncertainty and risk, and still sustain its key properties, such
as the capacity to feed humanity in the case of the GPE. Concerns have
been raised that the profound human influence on the biosphere is
eroding resilience and causing abrupt changes in social, ecological and
social-ecological systems. These ‘regime shifts’ may interact and
cascade, thereby producing change at very large scales with severe
implications for the wellbeing of human societies. Since the GPE has
become a substantial part of the biosphere, investigation of what a
hyperconnected, homogenized and decoupled anatomy means for
its resilience is urgently needed.


The structure of fragility 
Analysing systems as networks that consist of nodes and links has 
proved to be a fertile ground for exploring the relationship between
Fig. 2 | Masking loss of resilience. a, The state of a local low-intensity
production ecosystem (blue dot) is maintained by a suite of biophysical
processes (red arrow). Variability in environmental conditions creates
fluctuations in biomass output (blue bars). This variability may not be an
acceptable solution as drops in the production may not sufficiently meet the
needs of people depending on it. b, A local high-intensity production
ecosystem is kept in a forced state by continuously adding anthropogenic
inputs, such as increasing use of antibiotics to avoid diseases in aquaculture
and livestock, and herbicides to prevent weeds in crop systems. Intensification
increases productivity and suppresses fluctuations in harvestable biomass in
the short term (blue bars). This occurs at the expense of eroding resilience in


structure and resilience in ecological, financial, technological
and climatic systems. Depending on how nodes are organized,
connectivity can increase or decrease the resilience in a network. A
recent empirical reconstruction of the global food trade network from
1986 to 2013 showed that it displays characteristics of a heterogeneous
network in which countries have many incoming (import) and few
outgoing (export) connections, or vice versa. The study also found
that the food system has become progressively delocalized as a result
of globalization (that is, modularity has been reduced). Combining
these properties, the authors concluded that resilience in the global
food network has declined over the past 20 years and that addition of
new trade routes to this heterogeneity will further erode resilience.

Another important line of research focuses on the interaction 
between connectivity and diversity in the network (that is, how nodes 
are different from each other). Studies suggest that, in systems in
which the diversity of responses among nodes is high and connectivity
between them is low, the systemic response to perturbation is gradual.
By contrast, if nodes are homogeneous and highly connected, their
responses become more synchronized. The global financial crisis
provides an illustrative example in which a small number of tightly
connected banks deployed similar risk-management models, thus cultivating
homogeneity at the global scale and paving the way for shocks
to propagate throughout the financial system. Connectivity and diversity
therefore determine whether a system has a shock-dampening or
a shock-amplifying effect when exposed to perturbations. Linking
the long term (dashed line and black arrow), which increases the risk of
surpassing a threshold beyond which the system may fall into a degraded state,
precipitating a collapse of biomass. c, Similarly, the GPE (represented by the
Earth) is kept in a forced state through intensification, trade and spatial
displacement of activities (red arrow), to maintain a high and predictable 
global supply of biomass arriving from different stocks, species, geographic 
locations (multi-coloured bars). Loss of resilience (dashed line and black arrow) 
is masked at a global level, thus increasing the risk of shifting the GPE into an 
unknown state. To the right are systems within which examples of the 
illustrated dynamics can be found (see Supplementary Table 1). 


networks together can help to reduce pressure in individual networks, 
but may occur at the expense of increasing fragility of the broader 
interconnected network. Indeed, studies in power-communication,
financial and ecological systems have shown that a large interconnected
‘network of networks’ can be intrinsically more fragile than
each network in isolation.

In the GPE, intensification and globalization have produced strong
interdependencies within and among sectors. In parallel, homogenization
has reduced the diversity of ways in which species, people, sectors
and institutions can respond to change (loss of response diversity)
as well as their potential to functionally complement each other (loss
of redundancy). This suggests that the GPE possesses features that
could amplify shocks. Understanding such potential shock-amplifying
behaviour will require a better evaluation of how ecological, social
and social-ecological connectivity and diversity interact (Box 2).


Masking loss of resilience 
Fluctuations in harvestable biomass outputs influence producer 
income and undermine the continued and stable supply to consum- 
ers (Fig. 2a). Strategies that reduce this variation to improve efficiency 
and predictability are therefore frequently sought. However, enhanced 
short-term control can have implications for resilience in the long term.
Increasing variability (variance) can be a signal of declining resilience
in complex systems, including ecosystems and social-ecological sys- 
tems” (but see ref. *'). Hence, intensification strategies that deliberately 


Box 3 


Linking anatomy to resilience: empirical examples from food production


The resilience of a given system depends on the interplay between 
exogenous (external) forces and the endogenous (internal) 
properties of the system (Box 3 Figure). There is growing evidence 
that key external and internal dynamics of the GPE have been altered 
to the extent that they are compromising its resilience. Here we 
consider and empirically illustrate two examples of such dynamics. 


Altered perturbations and increasing food-shock frequency 

As connectivity and homogeneity increase, shocks within a 
geographic area or sector can become globally contagious and 
more prevalent. Major drivers of these shocks include extreme
weather events, spread of disease and geopolitical or economic
conflicts, although their relative importance can differ across
regions. For instance, droughts and floods have been particularly
dominant forces of sudden declines in crop production over the
past decades in South Asia, whereas geopolitical and economic
crises are leading drivers of agricultural shocks in sub-Saharan
Africa. Furthermore, geopolitical and economic events (such
as withdrawal of subsidies, reduced export markets and internal
conflicts) tend to generate shocks that span multiple sectors across
both land and sea. Importantly, despite efforts to maintain high
yields, food production shocks have become more frequent over 
the past 50 years (Box 3 Fig. a). These shocks pose a threat to food 
Box 3 Figure | Linking anatomy to resilience. a, b, The schematic
of the stability landscape illustrates two ways by which exogenous
forces (a) and changes in endogenous system properties (b) can
lead to state shifts (dashed arrows) in food production systems.
a, The annual increase in global food production shock frequency,
including fisheries, aquaculture, crop and livestock sectors.


suppress variance may remove a useful warning of declining resilience 
in production ecosystems, sectors and the broader GPE. Variance is
often suppressed by controlling stress and stochastic perturbations
such as grazing, fire and pest outbreaks. Such events have been proposed
to increase system resilience in the long term by selecting for
particular tolerant genes, species traits or practices. Therefore, preventing
these events may gradually erode resilience, making systems
more vulnerable to disturbances that could previously be absorbed.
Suppressing short-term variance can also lead to an accumulation of
variance in the longer term. As variance accumulates, more force


security and the resilience of the global food system through price 
volatilities and effects on trade.


The looming threat of herbicide resistance 

Homogenization towards pesticide-intensive production practices 
in agroecosystems has increased selection for pesticide resistance 
and reduced the resilience of the GPE to pests and pathogens.
For instance, according to The International Survey of Herbicide 
Resistant Weeds (www.weedscience.org), as of 2019, weeds have 
evolved resistance to 167 different herbicides and to 23 of the 26 
known herbicide sites of action. Glyphosate is currently the most 
commonly applied weedkiller, accounting for approximately two- 
thirds of all herbicide use globally. First introduced in 1974, the 
application of glyphosate began to escalate in the mid-1990s with 
the rapid adoption of glyphosate-resistant transgenic crops, which 
enabled farmers to use glyphosate liberally as a strategy to maintain 
and enhance yields (Box 3 Fig. b). However, such short-term 
damping of variability can erode resilience in the long term. The
overreliance on glyphosate rapidly accelerated the evolution and
spread of glyphosate-resistant weeds across all major economies
(Box 3 Fig. b), ultimately creating conditions for a looming global
failure of weed management.
The confidence interval (orange shaded region) represents the
range of plausible shock frequencies under different model
parameters used for shock detection. b, The growing number of
glyphosate-resistant weeds (black line) as global glyphosate use 
(blue bars) increases. See Supplementary Note 1 for methodology 
and data sources. 


(that is, human input) is required to maintain the system in a desired 
state (Fig. 2b). Resilience under such conditions has been described as 
‘coerced’. In forest production ecosystems, for example, stochastic 
wildfires are often curbed to maintain high and stable yields of harvestable
biomass. However, small-scale fires have an important role
in reducing the accumulation of dead wood and creating a heterogeneity
of patches with less-flammable species that reduces the risk of
ignition and prevents fires from propagating. This allows the system
to suffer fire without imminent risk of crossing a critical threshold
whereby it becomes catastrophic and uncontrollable. By contrast, when
small-scale wildfires are suppressed, homogeneity increases and the
amount of wood fuel piles up. This creates a situation in which a single
ignition could potentially set the whole forest on fire. If a catastrophic
fire unfolds, it can start to interact with the atmosphere and generate
convection-driven winds, which further increase its size, spread and
speed, making the fire unstoppable. Consequently, management
aimed at controlling short-term variability breeds systemic vulnerability
in the long run (Box 3).
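
The idea that rising variance can signal declining resilience is often put into practice as a rolling-window variance of an output time series. The sketch below is illustrative only: the function name `rolling_variance`, the window length and the biomass series are assumptions made for this example, not data or methods from this Perspective.

```python
from statistics import pvariance


def rolling_variance(series, window):
    """Population variance of each consecutive window of the series."""
    if window < 2 or window > len(series):
        raise ValueError("window must be between 2 and len(series)")
    return [pvariance(series[i:i + window])
            for i in range(len(series) - window + 1)]


# Made-up biomass output: stable at first, then increasingly erratic.
output = [10.0, 10.0, 10.0, 12.0, 8.0, 14.0, 6.0]
variances = rolling_variance(output, window=3)
print(variances)
```

In this made-up series the indicator rises from window to window. Management that deliberately damps such fluctuations would flatten the indicator, which is one way of picturing how a warning signal can be lost.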

The anatomy of the GPE provides for spatial suppression and accumulation
of variance at a global level because components of the system
(for example, sectors, places and stocks) are often viewed and governed
in isolation (Fig. 2c). This global coercion of resilience is facilitated by
sequential exploitation and displacement of activities. For example,
countries transitioning from net deforestation to net reforestation
may do so through geographic substitution. In Vietnam, forest cover
increase was achieved at the expense of deforestation in neighbouring
countries such as Cambodia and Laos. Similarly, following the collapse
of the North Sea cod (Gadus morhua) population, UK imports shifted to
Atlantic cod sourced from Iceland and the Faeroes, leaving UK consumers
relatively unaffected and unaware. Such decoupling mechanisms
could explain why national supply stability tends to increase as countries’
reliance on trade grows, although it may contribute to global
instability in the long term.


Altered disturbance landscape and systemic risks 

Resilience management has generally focused on local systems and 
their capacity to deal with a narrow range of well-known shocks, such
as drought, fire, pest outbreaks and, increasingly, climate change.
However, perturbations that previously had only minor or no effects
on a certain production ecosystem may suddenly become significant
as sectors are progressively intensified and intertwined. For example,
droughts or crop pest outbreaks may cause disruption in seafood production,
as the aquaculture sector is now dependent on agriculture
for crop feeds. Moreover, the GPE has become increasingly exposed
to price fluctuation in inputs (for example, fossil fuels, fertilizers and
technology), shifts in global consumer preferences (for example,
diets), changes in policies (for example, regulations on energy and
exports) and speculation on food commodities. Concerns have also
been raised about the vulnerability of the infrastructure network on
which trade of biomass relies, such as choke points in maritime transportation,
which could generate significant instabilities if disrupted.

As connectivity and homogeneity increase, shocks that were previ- 
ously contained within a geographic area or a sector are becoming 
globally contagious and more prevalent (Box 3). For example, protectionist
trade strategies, such as implementation of export bans
following droughts to protect populations in producing countries,
can affect nations that rely on trade to balance their food needs.
Interest in these types of interconnected risks has increased in recent
years along with the terminology to describe them, including nested
and teleconnected vulnerabilities, hyper-risks, femtorisks, global
systemic risks and Anthropocene risk. They stem from interactions
at the interface of multiple systems (for example, climatic, ecological,
political, financial and technological), making causal links opaque and
outcomes difficult to foresee.

Despite the inherent uncertainties, this broad spectrum of perturbations
and interconnected shocks must be considered to adequately
manage resilience in the GPE. It also suggests that the limits of the GPE
in satisfying demands for harvestable biomass may be set by the potential
consequences of these emergent risks, as opposed to hard upper
limits to production per se. The future will require confronting risks
that we know little about, such as the consequences of an expanding
global financial sector (Box 1) and new technologies, including
the growth of genetic engineering and synthetic biology. It will also
entail accounting for interactions with non-biomass producing sectors
that, for instance, support critical infrastructure or energy supply.
Competition between production sectors for land and resources (most
importantly water) is indeed likely to intensify as demand continues
to grow and effects of climate change unfold.


Towards a sustainable GPE 


Providing a growing human population with food, fibre and fuel in a
sustainable and fair way is one of the grand challenges facing humanity.
Although the GPE has offered huge benefits by increasing the production
of certain desired species, the intensification and simplification
of production ecosystems have been criticized from ecological,
social and social-ecological perspectives. Consequently, we argue
that it should be substantially and deliberately transformed towards
a sustainable trajectory, on which: (1) the demands for biomass are
met in a fair and just way, without undermining the functioning of the
biosphere, (2) connectivity is capitalized on to improve sustainability,
(3) biological and social diversity is enhanced to ensure building
blocks for adaptability and transformation in the face of change, and
(4) feedback loops are strengthened (recoupled) to avoid masking
effects and coercion of resilience.

Determining the boundary conditions characterizing a sustainable 
GPE is a challenging task that will involve a mix of approaches. The 
planetary boundaries framework can be used to define global and
regional limits in biophysical processes (the ‘safe operating environmental
space’) that must not be transgressed if humanity is to stay
away from systemic and potentially irreversible shifts in the biosphere.
For example, this framework was recently applied to quantitatively
estimate how to keep the global food system within environmental
limits. Combined with the aspirational social goals framework (‘safe
and just space for humanity’), this can provide a starting point for
discussions around levels of acceptable risk and trade-offs between
productivity, sustainability and equity.

Steering the GPE towards a sustainable trajectory will also require a
combination of more specific strategies and solutions, as well as careful
consideration of their feasibility and the trade-offs involved. Although
the polarized debate between the integration (land sharing) and separation
(land sparing) of conservation and production fits into discussions
around food production and land scarcity, it is ill-suited to address
issues of scale (for example, temporal variation in agricultural land
use patterns and total area for conservation) or effects of globalization
(for example, displacement activities). The land-sparing versus land-sharing
debate is too often framed as a binary choice, ignoring possible
middle ground and cross-fertilization. Within this context, sustainable
intensification has gained momentum in discussions around global sustainability
and has become a policy goal for many institutions to deliver
on global social and environmental commitments (for example, the UN
Sustainable Development Goals and the Paris Agreement). However, it
has also been criticized for having a narrow focus on efficiency gains
and technological interventions. More systemic forms of sustainable
intensification have therefore begun to occur at large scales and
across a wide range of agroecosystems, to redesign the composition
and structure of production ecosystems and harness a broad range of
ecological processes such as predation, parasitism, herbivory, nitrogen
fixation and pollination. Further efforts towards a sustainable GPE
include approaches to ensure more stable food supplies by increasing
national crop diversity, broad-scale shifts in diets and strategies to
reduce loss and waste of biomass, and the integration of local realities
and contexts, such as procedural justice and equitable distribution of
benefits from multi-functional land and seascapes.

Although these initiatives are contributing to sustainability in 
important ways, they are being challenged by an expanding GPE in 
which systemic, sectoral and jurisdictional boundaries are increasingly
blurred. Acting on this new reality entails creating conditions that foster
innovation, incentivize transformation and encourage new partnerships
across different sectors and actor groups. For this reason, we
propose three entry points towards a more sustainable GPE that have
great transformative potential but are still in their infancy.


Redirecting finance for sustainability 

Financial investments—public or private—are increasingly recognized 
as key leverage points for achieving sustainability. Government
subsidies channel large amounts of public capital into the different sec- 
tors of the GPE, ultimately influencing practices and species production 
on the ground. Whereas subsidies have mostly been associated with 
unsustainable practices, such as fuelling over-capacity in the fishing 
industry, they could also provide powerful incentives for improved 
sustainability if linked to the right criteria. In the European Union’s 
reformed Common Agricultural Policy, for example, a direct payment 
scheme is used to incentivize sustainable resource management, in 
which farmers who comply with greening measures (that is, address- 
ing biodiversity loss, avoiding crop monoculture and securing car- 
bon sequestration) benefit financially from payments (but see ref."»). 
Another recent government-led action is the alliance of Central Banks
and Supervisors Network for Greening the Financial System (NGFS),
formed during the One Planet Summit in 2017 to explore the role of
and possibilities for central banks to use their mandate to incentivize
economies to transition to more sustainable pathways.

Private financial actors such as asset managers and commercial banks 
channel the bulk of capital behind the expansion of the GPE by investing 
in or lending to companies in different production sectors. Although 
direct causality between financial flows and environmental change is 
often opaque, such investments represent a potential source of
influence over corporate practices. Shareholders of publicly listed
companies have the ability to affect a firm’s sustainability performance
by exercising their voting rights at shareholder meetings (shareholder
activism). They can engage directly with the corporate leadership on
governance and policy, or indirectly through chains of ownership and
threats of divestment. For example, the world’s largest sovereign wealth
fund, Norway’s Government Pension Fund, has divested from 32 companies
involved in unsustainable palm oil production since deforestation
became an ethical criterion in 2012 (https://www.nbim.no). The
insurance sector could also provide important leverage towards more
sustainable practices, for instance by refusing to insure fishing vessels
associated with illegal, unreported and unregulated fishing.
Similarly, loan covenants (that is, the specific conditions associated
with credit lending) provide a powerful tool for banks to influence the
behaviour of borrowing companies operating in the GPE by denying
access to clients that do not comply with sustainability standards and
providing incentives so that better sustainability performance results
in reduced interest rates. In this context, pressure from governments
and finance ministries will be essential to promote new norms and
regulations that can align banks, financial markets and other investors
with sustainability goals.


Radical transparency and traceability 

Consumers can be influential in promoting sustainability by aligning 
their purchasing with sustainable thinking. They are also important 
as citizens whose perceptions and opinions drive the political will to 
address sustainability issues. Education and provision of information
(such as certification, labelling schemes and public campaigns) are
therefore central instruments for consumers to make informed decisions.
However, if as a society we do not know where, how, in what
quantity and by whom a given commodity is produced, it is arguably
difficult to tackle sustainability challenges.

Whereas transparency is necessary to assess the environmental 
sustainability of corporate and financial activities, traceability represents
a key mechanism by which corporations can ensure that their
supply chains are devoid of unacceptable behaviour, ranging from
illegal sourcing and forced labour to poor sanitation and mislabelling.
Many of the operations of the corporate and financial world
are still plagued by opaqueness, including secrecy around financial
transactions and corporate loans, as well as poor disclosure on
implementation of corporate policy and internal allocation of capital
(see https://www.ifrs.org/).

Radical transparency and traceability require the disclosure of
production volumes and practices. They also demand that corporate
and governmental policies are put in place to ensure that social and
environmental criteria are met in all supply chain segments, as well as
mechanisms to monitor how such regulations are implemented and
enforced. To date, improved corporate disclosure has largely been
driven by voluntary action under the scrutiny of non-governmental
organizations (NGOs). Whereas the Global Reporting Initiative
(https://www.globalreporting.org) is a prominent example of widely adopted
sustainability reporting standards, the more recent World Benchmarking
Alliance (https://www.worldbenchmarkingalliance.org/) encourages
companies to disclose information that allows evaluation of their
operations in relation to industry benchmarks. Even though mandatory
reporting is increasing globally, limited regulation contributes to poor
transparency and sustainability-related corporate reporting remains
voluntary in many jurisdictions (see https://www.carrotsandsticks.net).
More stringent and clearly articulated criteria for disclosure therefore
represent an important step towards more transparent corporate
practices.

Emerging digital technologies that deliver decentralized systems,
such as blockchain, could resolve some of these issues and improve
traceability in the GPE. However, these technologies are energy-intensive
and interoperability remains a hurdle because seamless communication
between digital platforms and agreed-on data for transmission
are largely unrealized. Thus, barriers to chain-wide traceability are not
just technological but also organizational, and will require changes in
legislation and the institutions that govern trade to stimulate cooperation
throughout supply chains.


Keystone actors as global agents of change 

A key facet of sustainability science is that the identification of challenges
and their solutions requires collaboration between researchers
and actors from outside academia. Generally, these actors encompass
local communities, indigenous groups, management agencies, NGOs and
government actors. More recently, however, increasing attention has
been directed towards large transnational corporations and their role
as a threat to, or as an opportunity for, sustainable transformation.

Private governance raises concerns associated with accountability,
fair representation and global equity. Nevertheless, transnational
corporations have become a central feature of the GPE (that is, keystone
actors), with a capacity to influence practices across supply chains
and geographical locations, and thus have the potential to become
powerful agents of change for improved sustainability. An increasing
number of private sector initiatives is emerging with the intention to
mobilize companies to take tangible actions, make investments and
form partnerships to deliver on sustainability.

Scientists have an important role to play in this context, acting as
independent knowledge brokers to ensure that the agendas of keystone
actors are based on scientific evidence and align with long-term
sustainability goals. Seafood Business for Ocean Stewardship (SeaBOS)
provides an unconventional example of a co-production initiative in
which scientists directly engaged with the world’s largest seafood
companies to stimulate transformative change towards improved ocean
stewardship. Drawing on an empirical identification of the largest
companies involved in aquaculture and wild-capture fisheries, this
global science-business initiative emerged in 2016 with a number of
task forces led by member companies in collaboration with and supported
by scientists (https://www.keystonedialogues.earth). While
the long-term outcome remains to be evaluated, the 10 companies
engaged in SeaBOS can influence the strategic direction of more than
600 subsidiaries with operations in at least 90 different countries.
Although this presents a promising approach to be replicated in other
sectors in the GPE, such engagements do not come without risk. For 
scientists, they may cause reputational damage and loss of credibility 
if companies use the initiative for greenwashing purposes or if they 
fall short on their promises. For the private sector, they may lead to 
competitive disadvantage and loss of profit in the short term if other 
companies do not participate. Nevertheless, with renewable biomass 
and global sustainability at stake, there are strong incentives for novel 
science-business partnerships to emerge in combination with effective 
public policies and improved governmental regulations.


Conclusion 


The rate of change of the Earth’s system is accelerating. Unless meaningful
actions are taken within the next decade, we will almost certainly
face a changed and increasingly unstable climate regime, with serious
disruptions to the GPE as a consequence. The current GPE is itself
a major driver of this change, accounting for nearly a quarter of all
anthropogenic greenhouse gas emissions over the past decade. As
a result, agriculture, forestry and fishing are increasingly embedded in
international efforts to tackle climate change. Government policies are
essential to foster such transformations and align the global economy
with sustainability goals. In the face of the urgency and complexity
of this challenge, we also need to explore new spaces for innovation
and transformation. Although the avenues proposed here are in their
infancy, they provide potential entry points for transformative change
and a complement to effective governmental regulations. Ultimately,
moving towards a more sustainable GPE is likely to require radical shifts
in deeply held values, education systems and social behaviour that
underpin current economic paradigms, consumption patterns and
power relationships. Scientists have an important role to play
in this process.

The current trajectory for crop yields is insufficient to nourish the world’s population
by 2050. Greater and more consistent crop production must be achieved against a
backdrop of climatic stress that limits yields, owing to shifts in pests and pathogens,
precipitation, heat-waves and other weather extremes. Here we consider the potential
of plant sciences to address post-Green Revolution challenges in agriculture and
explore emerging strategies for enhancing sustainable crop production and resilience
in a changing climate. Accelerated crop improvement must leverage naturally evolved
traits and transformative engineering driven by mechanistic understanding, to yield
the resilient production systems that are needed to ensure future harvests.

The Green Revolution of the 1960s enabled a steep increase in the
yields of major staple grain crops (wheat, corn and rice) to address the
caloric needs of an increasing global population. This was accomplished
through elite variety breeding, hybrid crop development, fertilizer
application and advances in management through substantial public
investment. The consequent rise in food security benefitted many
regions of the world and improved agricultural development (particularly
in India and southeast Asia), reducing poverty and malnourishment.

By the 1980s, molecular and transformation technologies propelled
the delivery of the first bioengineered genes into plant genomes. Currently,
the most widely adopted genetically modified traits are resistance
to herbicides and insects in crops with large markets (maize,
soybean, cotton and Brassica napus (canola)). Although herbicide-
and insect-resistance traits greatly lessened soil tillage and insecticide
use, respectively, they require careful management to avoid natural
selection of resistance in weeds or pests. Despite engineered traits
with clear benefits to farmers and end-users (including virus-resistant
papaya, drought-tolerant corn, rice and bananas fortified with provitamin A,
non-browning apples and low-acrylamide potatoes), the
acceptance of genetically modified traits is equivocal in some countries,
and the cultivation of genetically modified crops is largely banned in
the European Union.

Future food security will require reducing crop losses due to envi- 
ronmental factors, including climate change, as well as transformative 
advances that provide major gains in yields. More recent genomic tech- 
nologies have expedited breeding and trait development for increased 
environmental resilience and productivity. Genetic diversity is now 
readily explored at nucleotide-scale precision, using genome-wide
association studies and other gene-mapping methods paired with
advanced phenotyping systems. The identification of loci that con- 
tribute to traits, coupled with molecular-marker-assisted breeding, 
enables the rapid selection of new genetic combinations in elite varie- 
ties. Complementary to breeding approaches, advances in the spatial 
and temporal regulation of engineered genes and pathways are increas- 
ingly accelerated by the targeted editing of genomes using CRISPR-Cas 
technology. A greater understanding of plant mechanisms that increase 
yields in variable environments is essential to drive the necessary gains 
in crop improvement, which can be fuelled by genetic diversity and 
implemented by genome-scale breeding, finely-tuned gene engineering 
and more-precise agronomic management practices. 


Post-Green Revolution challenges 


Despite the marked effect of the Green Revolution on food security,
there were uneven consequences for human nutrition, the resilience
of crops to stress, and the environment. Asian populations benefitted
from the increased production of staple grains, and the adoption
of irrigation across vast areas. The limited rise in food security in
sub-Saharan Africa and other impoverished areas can be traced to
geographically skewed support and a lack of investment in orphan crops.
An unintended consequence has been that fruits and vegetables rich in
macronutrients have been displaced by calorie-rich and higher-value
grain crops in some areas. Moving forward, an increased production of
nutrient-rich vegetable, pulse, tuber and cereal crops, and a broadening
of the global reach of agricultural advances, is necessary to achieve
food and nutritional security.


Climatic stress and disease management 

The increasing frequency of debilitating heat-waves, droughts, torrential
rains and other weather extremes experienced across the globe
negatively affects agricultural productivity, and is projected to continue to do so
(Fig. 1). Climatic constraints can occur independently or together (as
with heat and aridity), and in either case reduce the level of productivity
that is predicted for a well-managed environment (the yield potential).

Fig. 1 | Predicted national-scale yield loss for maize, rice, wheat and soybean.
a–c, Maps indicate the yield losses caused by aridity stress averaged from
1950–2000 (a), heat stress averaged from 1994–2010 (b) and nutrient stress in
2009 (c). National data for each crop were previously compiled, and are here
averaged and re-plotted using the maps package in R. d, Number of large flood
events from 1985 to 2010 by country.


An unanticipated consequence of the development of high-yield varieties
for locations with advanced cultivation practices has been a loss
of genetic variation that is associated with resilience to suboptimal
environments. It is imperative to breed crops that carry a diversity of
resistance genes and/or to plant a diversity of varieties, as this approach
minimizes the ability of pathogens to overcome resistance. An increasing
occurrence of extreme weather events, together with dire projections
of climate change, makes the improvement of crop resilience to
environmental (abiotic) and pathogen (biotic) stress of paramount
importance for feeding a growing global population.


Fertilizer use 


The combination of high-yield crop varieties and the widespread use
of inorganic fertilizers markedly improved crop production, with clear
benefits in terms of food security. This has translated to excessive
anthropogenic release of reactive nitrogen and phosphate into the
environment. Inorganic fertilizers have pushed the global nitrogen and
phosphorus cycles well beyond their estimated safe operating space,
with considerable negative effects on biodiversity, human health and
the atmosphere. Their use presents a paradox, as the optimization of
plant nutrition stabilizes yields and has helped to reduce expansions
in crop area in light of population growth, yet nitrogenous fertilizers
contribute substantially to the greenhouse gas emissions that promote
climate change.


Paths forward 

The agriculture of the next decades must satisfy demands for nutritious
food, fibre and animal feed in a highly variable climate, and also
mitigate the effects of agriculture on the environment. This is a tall
order. Key to addressing the challenge is a deeper understanding of
genetic variation and the molecular, cellular and developmental pathways
by which plants dynamically respond to and interact with their
environment and pathogens, while maintaining growth, efficiency of
nutrient use and fitness. New crop varieties ideally will have genetic
combinations that alleviate losses from the multiple environmental
and pest constraints that are encountered during the crop lifecycle
in a farmer’s field. An important emerging and non-trivial goal is
to optimize the efficiency of photosynthesis, water and nutrient 
use, including the fostering of beneficial interactions between plants 
and microorganisms that can promote nutrient acquisition. The 
integration of mechanistic understanding, genetic variation and 
genome-scale breeding towards technological solutions will be essen- 
tial. Here we review advances and emerging directions within the 
plant sciences that may bolster yield-defining traits and resilience 
(Fig. 2, Box 1). 


Protection from new and re-emerging diseases 


The reliance on major pathogen-resistance genes bred into crop
monocultures provides short-term protection against diseases, as
seen in the boom-and-bust cycles of resistance over the past century
and in the spread of new diseases across continents. A more complete
molecular-genetic framework now exists for general and specific
resistance to microbial pathogens mediated by a two-layered receptor
signalling system (Fig. 2a). At the plant cell surface, families of
pattern-recognition receptors register the presence of microorganisms.
Inside host cells, large panels of nucleotide-binding domain
leucine-rich-repeat receptors (NLRs) detect activities of invasive
pathogenic strains. Advances in elucidating receptor-pathogen
recognition and activation mechanisms at a protein-structural level
provide strategies towards the rational design of receptor proteins
that are tailored to intercept broader or alternative disease agents.
Newly engineered resistance traits can then be transferred to elite
varieties of crops to confer resistance against modern diseases. There
have been notable successes in the inter-family transgenic transfer
of a pattern-recognition-receptor gene to potato, tomato, Medicago,
wheat and rice, indicating that surface receptors that are restricted
to particular plant lineages can confer immunity in unrelated species.
Transfer of the wheat Pm3e resistance gene against powdery mildew
to a susceptible wheat variety has produced effective mildew resistance
in field trials. The engineering of pathogen-induced translational
control of a key Arabidopsis immunity component in rice has
provided promising disease-resistance benefits in initial crop field
trials, apparently without a yield penalty. The incorporation of new
surface- and intracellular-receptor recognition and signal transduction
modules into crops is also on the horizon, building on knowledge
of receptor functional partnerships and resistance network architectures.
Success in this area, especially as climates change, will
require tight immune-receptor control, which can require co-evolved
Fig. 2 | Paths to increased crop yield in suboptimal environments. Overview
of traits that provide increased resilience and yield in variable environments.
a, Pathogen recognition by cell-surface and intracellular receptors (resistance
proteins). Manipulation of host cells by pathogen-secreted effectors to
promote infection can be recognized by resistance proteins and converted to
disease resistance. b, Flooding survival via opposing regulation of gibberellin
(GA). Semidwarf 1 (SD1), Snorkel 1 and Snorkel 2 (SK1/2) confer escape by
accelerated elongation growth. Submergence 1A (SUB1A) confers tolerance by
quiescence of growth. c, Root growth towards moisture involves
transcriptional regulators (indole-3-acetic acid inhibitor protein 3 (IAA3) and
auxin response factor (ARF7)), and is regulated by the hormones ABA and
auxin. d, HKT1 (high-affinity K⁺ transporter sub-family 1) mediates sodium
(Na⁺) exclusion from leaves. e, In developing seed tissues, catabolism of T6P
aids the movement of photo-assimilate carbohydrate (CHO) from leaves to
sinks in developing florets. f, Optimizing photosynthetic light harvesting and
CO₂ fixation by altering photosynthetic protein abundance and minimizing
photorespiration. PS, photosystem. g, Dynamic control of stomatal aperture
by pairs of epidermal guard cells lessens desiccation. h, Symbiotic
plant-microorganism interactions facilitate the uptake of essential nutrients. NH₄⁺,
ammonium; PO₄³⁻, phosphate; NO₃⁻, nitrate.
Box 1


Yield-defining traits and opportunities for crop improvements 


This Review discusses several traits that are essential for crop performance, including the genetic variation and plasticity that are relevant for 
improvement (left) and advanced and emerging approaches for addressing trait improvements (right). Stress resilience coupled with high 
yields is aided by hardwired traits and temporal responses to a dynamic environment. Opportunities for improvement include capturing 
natural genetic variation, the functional characterization of genes, the manipulation of endogenous or transferred genes with appropriate 
regulatory control, the development of low-cost and safe small molecules that can be delivered to plants before stress or during recovery, 
and improved plant health through interactions with symbiotic microorganisms. 


Yield-defining traits 

Shoot traits and plasticity

Inflorescence architecture and fertility 
Shoot-to-root biomass 

Photosynthesis 

Stomatal movement and density regulation 
Assimilate loading and partitioning 
Senescence timing 

Root traits and plasticity 

Architecture and anatomy 

Growth dynamics 

Nutrient acquisition and use efficiency 
Microbial interactions 

Stress resilience 

Drought, salinity, flooding and extreme temperatures (abiotic) 
Pests and pathogens (biotic) 

Tempered response to minimize growth penalty 


factors, because modified receptors without requisite constraints
are often mis-activated and cause necrosis and reduced plant health.
Knowledge in this arena may also lead to strategies that lessen food
spoilage by pathogens after harvesting.


Harnessing resistance from diverse germplasm 
With increased access to diverse genetic variation found in crops and
natural populations of wild relatives, the door is open to recovering
disease-resistance traits; many of these are encoded by genes for pattern
recognition and NLRs that were lost during domestication, or that
have evolved independently in different plant lineages. Advances
in genome sequencing and assembly technologies, coupled with new
methods for capturing near-complete immune-receptor gene panels
from complex genomes, hold promise for attaining sustainable disease
resistance. Merging these approaches with genome-wide association
studies taps into immune-receptor gene variants that have adapted
to local environments and pathogen populations to help to increase
the resilience of future crops. The stacking (or ‘pyramiding’) of several
resistance genes with different recognition spectra and environmental
optima into a single background is now a credible strategy for achieving
more durable disease resistance. Nevertheless, assembling appropriate
gene combinations in elite varieties of crops remains a challenge.
Investigations of plant genomes within and across species provide
insight into the evolutionary forces that have shaped the architecture
and function of genes related to immunity. This will aid the design of
new resistance traits. The isolation and characterization of genes
associated with disease susceptibility in the host has also gained
prominence. The proactive deployment of modified susceptibility genes
in crops will become possible as geographical sampling of pathogen
genomes and populations increases. A recessive barley mildew
resistance locus o (mlo) mutant that breeders have successfully used
for 75 years against powdery mildew disease has been engineered into
hexaploid wheat using mutagenesis by transcription-activator-like
Opportunities

Natural genetic variation 

Stress resilience and recovery mechanisms 

Trait pyramiding 

Gene engineering and editing 

Spatial, temporal and inducible control of genes and networks 
Improving protein function, targeting and turnover 
Enhancing metabolite pathway and flux 
Introducing synthetic traits 

Beneficial soil and leaf microbiome 

Seeding and supplementation 

Attraction of beneficial microorganisms 
Small-molecule delivery 

Response activation 

Metabolic regulation 

Sensor use for crop management 

Cellular, organ, canopy and remote 


effector nuclease, or by combining mutations selected in three wheat
MLO loci. Thus, the characterization and manipulation of host-pathogen
infection processes for the establishment of disease can generate
novel resistance mechanisms in crops that are not necessarily found
in natural populations. As a cautionary note, some newly engineered
crop lines have displayed unexpected phenotypes and vulnerabilities
to disease, which underscores the need for rigorous performance
testing of new material in field settings over multiple seasons.


Pathogen resistance in a shifting climate

Given the current trends in climate, attaining durable resistance in
high-yield crops will require a greater knowledge of pathogen population
dynamics and plant host responses to temperature. The alarming
spread of devastating disease agents (such as the bacterium Xylella
fastidiosa that attacks olives and woody crops in southern Europe, or
the Ug99 strain of stem rust fungus Puccinia graminis f. sp. tritici that
affects wheat across parts of Africa and Asia) is attributed in part to
warmer climates, and presents a complex biogeographical and
epidemiological problem. Moreover, surface- and intracellular-immunity
systems appear to respond differently to changes in temperature, owing
to effects on the microorganisms and hosts that are not completely
understood. Although immunity mediated by intracellular receptors
tends to be less efficient as temperatures increase, some resistance genes
confer protection at higher temperatures. Although these findings
are reassuring, they highlight the need for more genotype × environment
studies in crops to stabilize resistance in increasingly precarious climates.


Resilience to abiotic stress 


Abiotic stresses associated with climate change that destabilize yields
include flooding, drought, soil salinity and extreme temperatures
(Fig. 1). Resilience mechanisms have been mobilized for crop improvement
through the identification of genes that are associated with key
traits and signal transduction pathways, followed by breeding or
engineering (Fig. 2b-f). Attaining resilience without affecting overall
yield is a considerable challenge.


Flooding 

Floods regularly limit yields. Rice is exceptionally resilient against
flooding, yet over 30% of the acreage cultivated with rice suffers yield
loss owing to plant submergence. SUBMERGENCE 1 (SUB1), identified
in a flooding-tolerant landrace of rice, encodes a cluster of genes for
ethylene-responsive transcription factors, including the
submergence-activated SUB1A-1. The SUB1A transcription factor curbs the activation
of genes that promote the breakdown of leaf sugars and starch
(photo-assimilate) that would otherwise fuel growth to escape an inundation
(Fig. 2b). The introduction of SUB1A-1, through marker-assisted breeding,
into high-yield varieties now provides an additional week or more
of submergence tolerance, without compromising yields under
non-submergence conditions.

The submergence tolerance by growth quiescence provided by SUB1A-1
contrasts with the accelerated underwater elongation of the shoots of
varieties of crops that are adapted to progressive seasonal floods in delta
regions. Deepwater varieties invest photo-assimilate into the extension
of submerged stem internodes (Fig. 2b). This requires the SNORKEL 1
(SK1) and SNORKEL 2 (SK2) genes that encode transcription factors that
are similar to SUB1A, as well as biosynthesis of the hormone gibberellin.
Gibberellin biosynthesis involves a functional allele of SEMIDWARF1
(SD1), the gene that, when mutated, determines the short stature of
Green Revolution rice. This knowledge can improve yields in low-lying
areas that are affected by climate change.

The alleles of SUB1A, SK1, SK2 and SD1 that are key to flooding survival
are found in wild Oryza species, which indicates that they arose in
ancestral populations in flooded ecosystems. Evolution has modified
the same growth-response network involving the hormones ethylene
and gibberellin to achieve submergence tolerance or escape in diverse
species of wetland plants. Pathways to improved flooding tolerance
include manipulation of root traits associated with waterlogging tolerance
that involve a conserved mechanism and the oxygen-dependent
turnover of SUB1A-1-like transcription factors, accomplished in several
species. There are other opportunities to protect yields in wet
climates. Torrential rain and hail can cause yield losses of 50% or more,
owing to premature pod shattering in oil crops. The identification of
genes that control pod shattering in Arabidopsis enabled the
gene-targeted molecular breeding of optimized pod-shattering properties
in canola that is now increasingly planted by farmers.


Drought 


Drought and other dehydration or osmotic stresses (salinity and cold)
stimulate the production of the hormone abscisic acid (ABA) in plants.
Although the mechanisms of the initial sensing of osmotic stress and
signalling in response to osmotic stress remain poorly understood, the
elucidation of the ABA receptor and signal transduction mechanisms
has exposed new avenues for the enhancement of dehydration tolerance.
This includes ABA receptor overexpression and engineering to
respond to exogenously sprayed small molecules, the overexpression
of signal transduction components or the drought-driven repression
of negative regulators of ABA signal transduction.

ABA closes the adjustable stomatal pores on the leaf surface that allow
gas exchange and thus reduces the water lost from plants during drought,
but this response can be weak in crop varieties. ABA also helps to
regulate root growth in response to water availability, including inhibition
of lateral root growth and enhancement of primary and secondary
root growth. This developmental reprogramming allows roots to seek
water. The DEEPER ROOTING (DRO1) gene of rice provides a deep root
architecture in paddy fields, which bolsters yields under water-limited
conditions (Fig. 2c). The identification of the major loci that control
root traits associated with drought resilience has proven challenging
owing to their quantitative nature and low heritability, which requires
sophisticated belowground phenotyping and analytical methods. Yet
roots grow laterally towards moisture in soil. New roots that access
moisture emerge only on the side of a root that is moist, as a modification
of a key auxin-response transcription factor on the dry side of a
root impedes the developmental program (Fig. 2c). Knowledge such
as this can inform strategies for the advanced breeding and engineering
of improved resilience to drought, which continues to limit yields.


Salinity 

Irrigation substantially expands growing seasons and increases crop
yields in many regions. Salt (sodium chloride) gradually accumulates
in irrigated soils and is toxic to most crops; sodium accumulation is
particularly detrimental in leaves. Approximately 40% of irrigated lands
worldwide are affected by increased salt levels, and expansion of soil
salinization is a major threat to crop performance.

Plants encode a sodium-transporter gene sub-family named HKT1
(high-affinity K⁺ transporter) that provides protection from the
over-accumulation of sodium in leaves (Fig. 2d). HKT1 mediates the removal
of sodium, mainly in roots, from the xylem, the vascular conduits
that transport water and nutrients from roots to leaves. Major quantitative
trait loci that enhance salt tolerance in wheat, wheat relatives
and rice possess distinct HKT1 alleles. This knowledge has enabled
the marker-assisted breeding of wheat with a higher salt tolerance,
resulting in a yield improvement of 25% under salinity stress in field
trials. Beneficial alleles of HKT1 may enhance salinity tolerance in
other species, as has been shown in rice. It will be necessary to combine
HKTs with other strategies to further boost salinity resistance
as land salinization continues to rise. The effects of salinity on root
development also need to be factored into intervention strategies.
Natural variation in transporter genes and their regulation has also
provided field-tested solutions for other toxic elements, including
aluminium and boron.


Extreme temperatures 

Higher atmospheric levels of CO₂ and other greenhouse gases are predicted
to increase the frequency and duration of heat-waves, which
will lead to losses in crop yield, especially in arid regions. Sensitivity
to extreme temperatures varies during the plant lifecycle and across
species. Low temperatures influence the germination, establishment,
growth and viability of crops, except for those with temporal chilling
or freezing resilience (such as winter wheat). Genetic variation in key
transcriptional regulators of cold resilience is leveraged in breeding of
several grain crops. By contrast, warm temperatures promote growth
until a threshold is reached above which yields precipitously diminish,
especially when soil moisture is low or humidity is high. Sensitivity to
temperature extremes is heightened during reproduction, when it reduces
male fertility and seed quality. This presents a daunting challenge as
protective responses are typically accompanied by reduced yield. Heat
stress is an expanding threat in tropical regions, because, at high humidity,
plants are less able to cool their leaves by transpiration via stomatal pores
that control the trade-off between CO₂ intake and water loss. There is an
urgent need for research and for ensuing genetic and engineered solutions
that preserve crop productivity at increased temperatures (Box 1).


Metabolic control of resilience and yield 

Breeding or engineering plants for a high yield potential in varied and
variable environments is a potential solution for capturing effective
resilience. Plants typically dampen growth and accelerate reproductive
development as a consequence of stress. Yield maintenance under
Fig. 3 | Targets for improving the efficiency of photosynthesis and primary
carbon metabolism that have experimental support for success. Transgenic
manipulations of photosynthetic metabolism that lead to improved
photosynthetic efficiency include (1) improving photosynthesis in a dynamic
light environment by accelerating recovery from a photoprotected state, by
overexpressing enzymes (such as photosystem II subunit S (PSBS) and VDE) that
are involved in non-photochemical quenching (NPQ) (the dissipation of excess
excitation energy as heat); (2) altering the CO₂ response of stomata or the


moderate drought was significantly improved in AQUAmax corn hybrids
produced by selective and marker-assisted breeding. The underlying
genetic variants and mechanisms that enable these lines to conserve
soil moisture and delay the accumulation of biomass until grain filling
remain to be characterized. Higher yields under well-watered conditions
as well as under moderate drought at the time of flowering were
achieved in corn that expresses a metabolic enzyme that converts the
low-abundance metabolite trehalose-6-phosphate (T6P) to trehalose
in the phloem companion cells at the base of the ear and developing
florets (Fig. 2e). This cell-specific modulation of T6P augments the
mobilization of photosynthate to the unfertilized floret, and prolongs
the photosynthetic activity of leaves during grain filling. The spatial
modulation of T6P levels also regulates the draw of seed reserves into
young elongating shoots of rice, particularly when dry seeds are sown
directly into a flooded paddy. The application of plant-permeable T6P
analogues to wheat leaves increased seed filling and improved recovery
from drought. These examples illustrate the critical integration of
metabolism and stress resilience to improve crops that can be provided
by genetic variation and engineering.


Optimization of photosynthesis for yield 


Moderncropsare highly efficient at rapidly spreading their leaf canopies 
to maximize light interception, and at partitioning carbon and nutrients 
into seeds. However, crops are not as efficient at converting absorbed 
light energy into sugars through the process of photosynthesis". This 
may be because the proteins and enzymes that mediate photosynthesis 
evolved in a low-light marine environment, which was very different 
from modern agronomic and atmospheric conditions’. However, the 
conservation of chloroplast transmembrane proteins that collect light 
energy and participate in electron transfer reactions within the chlo- 
roplast, along with the conservation of enzymes involved in carbon 
fixation, reduction and regeneration across plant species’ (Fig. 2f), has 
aided the modelling of photosynthesis! and identification of numerous 
targets for increasing its efficiency’ (Fig. 3). Theoretical targets include 
expanding and optimizing light captureby theleaf canopy”, inducing 
amore rapid relaxation of non-photochemical quenching at photosys- 
temIP° increasing the carboxylation capacity of the Rubisco enzyme as 
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density of stomata on the leaf surface to increase the efficiency of water 
use!?°!22,123: (3) increasing the capacity for mesophyll conductance of CO,!%; 
(4) improving the energy efficiency of carbon metabolism by altering 
mitochondrial enzymes”; (5) optimizing investment in light collection’; 
(6) increasing electron flow through the photosynthetic electron transport 
chain"; (7) altering Rubisco properties and activation to increase CO, 
assimilation™*"*°; (8) bypassing photorespiration””; and (9) increasing the 
efficiency of ribulose 1,5-bisphosphate (RuBP) regeneration». 


wellas minimizing oxygenation and photorespiration’”’ 


regenerative capacity of the carbon reduction cycle 
electrontransport chain”, converting crops fromC,toC,metabolism™, 
and adding components of cyanobacterial or algal systems to pump CO, 
or compartmentalize Rubisco. Improving photosynthetic efficiency 
is neither anewnor auniversally accepted idea. Some have argued that 
the selection pressures endured by photosynthesis render it unamena- 
ble to improvement”. Despite decades of research, the challenge of 
engineering Rubisco for improved specificity and carboxylation rate 
remains unmet". However, some recent successes in engineering pho- 
tosynthetic enzymes and introducing novel pathways into chloroplasts 
may lead to substantial gains in crop performance, as outlined below. 

Maize photosynthesis and fresh weight were enhanced by over- 
expressing the small and large subunits of Rubisco, together with an 
assembly chaperone protein™. In wheat, the overexpression of sedo- 
heptulose-1,7-biphosphatase showed increased photosynthesis, and 
resulted in increased plant and grain biomass’. These genetic modifica- 
tions to key crops are promising; their ultimate potential can be tested 
by incorporating the changes into elite varieties, and evaluationin the 
field. Photosynthetic manipulations also show promise in the field inthe 
model plant tobacco. Re-engineering the expression of enzymes that 
control the induction, relaxation and amplitude of non-photochemical 
quenching successfully enhanced photosynthesis during natural light 
transitions, which resulted in 14-20% greater vegetative biomass inthe 
field"°. Even greater gains were observed by inserting enzymes involved 
in glycolate metabolism into chloroplasts to reduce photorespiration’”. 
Coupling this with reduced expression of a glycolate and glycerate 
transporter, to minimize glycolate flux out of the chloroplast, raised 
vegetative biomass by 40% under field conditions”. These studies rep- 
resent fundamental breakthroughs in understanding and engineering 
photosynthesis, which can now move from the proof-of-concept stage 
to expanded field testing. 


, enhancing the 
, optimizing the 


Rising atmospheric CO, and plant water loss 

Crops lose between 100 and >400 water molecules through stomatal
pores in leaves for every carbon atom that is fixed by photosynthesis,
highlighting a fundamental trade-off between carbohydrate
production and water use. Increases in CO2 concentrations inside
leaves cause a reduction in the size of stomatal-pore apertures.
The continuing rise in atmospheric CO2 is increasingly narrowing
stomatal pores, which can enhance the efficiency of water use by
crops. However, many crops have weak or non-optimal stomatal CO2
responses. Advances have been made in understanding the signal
transduction pathways that regulate water loss in response to CO2,
including the finding that the stomatal CO2 response requires
amplification by the ABA response pathway in guard cells, but also
includes unique components upstream of and parallel to that pathway
(Fig. 2g). The upregulation of the stomatal CO2 response by
guard-cell-targeted overexpression of carbonic anhydrases increased
instantaneous water-use efficiency by about 44% in Arabidopsis, without
a reduction in photosynthetic assimilation rates at ambient CO2. On the
other hand, C3 crops growing in nutrient-rich and water-sufficient humid
regions could benefit from a weaker CO2-induced stomatal closing
response, which may enhance growth owing to CO2 ‘fertilization’ in an
atmosphere with an increased concentration of CO2. A complete
understanding of the CO2 response pathway is needed to optimize and
test water-use efficiency and gas-exchange strategies in the field.
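
To give a feel for the scale of the trade-off quoted at the start of this section, the molecule-per-atom figures can be converted into a mass ratio. The sketch below is illustrative only and is not taken from the source: it assumes standard molar masses for water and carbon, uses 400 as a stand-in for the ">400" upper figure, and the function name is hypothetical.

```python
# Illustrative unit conversion (an assumption-laden sketch, not from the source):
# water molecules lost per carbon atom fixed -> grams of water per gram of carbon.

MOLAR_MASS_WATER = 18.015   # g/mol for H2O (standard value)
MOLAR_MASS_CARBON = 12.011  # g/mol for C (standard value)


def water_mass_per_carbon_mass(molecules_per_carbon_atom: float) -> float:
    """Convert a molecule-per-atom ratio into a gram-per-gram ratio."""
    return molecules_per_carbon_atom * MOLAR_MASS_WATER / MOLAR_MASS_CARBON


if __name__ == "__main__":
    # 100 is the lower figure in the text; 400 stands in for the ">400" upper figure.
    for ratio in (100, 400):
        grams = water_mass_per_carbon_mass(ratio)
        print(f"{ratio} H2O per C atom is about {grams:.0f} g water per g carbon")
```

Under these assumptions the range corresponds to roughly 150 g to 600 g of water transpired per gram of carbon fixed, which is why even modest changes in stomatal behaviour matter for water-use efficiency.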

Successful transgenic modifications in barley and rice have shown
that reducing the density of stomata improves plant performance under
water-restricted conditions. Overexpression of the chloroplast
photosystem II subunit S protein in tobacco was reported to lower
stomatal conductance and to increase the efficiency of water use by
field-grown plants. The effective manipulation of stomatal function will
require the discovery of the primary CO2 and/or bicarbonate sensors that
control the stomatal CO2 response, as well as harnessing natural genetic
variation in stomatal properties that could improve trade-offs between
carbon gain and water loss in a world with high levels of atmospheric CO2.


Technologies to reduce fertilizer use 


Yields of crops are heavily dependent on sufficient nutrition (in
particular, nitrogen and phosphorus) that is currently provided primarily
through the application of inorganic fertilizers. In smallholder farming
systems, crop productivity is limited by the availability of these
nutrients. Substantial advances have been made in understanding
the mechanisms of nutrient uptake, transport and use in plants,
with the aim of improving sufficiency (Fig. 2h). Balancing the use
of photo-assimilate with nutrient uptake is critical for optimizing
yields. The mutations that confer stem shortening in cereals, which
facilitated the Green Revolution, brought with them unintended
inefficiencies in nitrogen use that can be compensated for by changing
the balance of transcription factors that control growth and nutrient
use. Breeding can also contribute to reducing nutrient imbalances
through the optimization of rooting systems, nutrient transport activity
and partitioning.

In natural ecosystems, plants frequently engage with beneficial
microorganisms that facilitate the uptake of limiting nutrients such
as nitrogen and phosphate. In agriculture, these beneficial
associations are often dampened by supplied fertilizers, because plants
suppress their interaction with symbionts when they perceive ample
nutrients. Most plant species associate with arbuscular mycorrhizal
fungi that greatly expand the root-surface area for nutrient uptake,
and which actively mine immobilized phosphates from the soil.
Bringing associations with arbuscular mycorrhizal fungi more
effectively into annual cropping systems with moderate fertilizer use
could improve nutrient capture and increase sustainability, particularly
if the phosphate suppression of mycorrhization could be overridden.
However, applications of strigolactones (the plant-derived
low-phosphate signals to microorganisms) have so far been insufficient
to override the suppression of mycorrhization, and more research
is therefore needed to obtain benefits from mycorrhizal associations
in agriculture.


Engineering the nitrogen symbiosis 
Some plants are colonized intracellularly by nitrogen-fixing bacteria
that can deliver the complete nitrogen needs of the host plant.
Associations such as this are limited to a select group of species, which
presents an opportunity to radically improve nitrogen availability for
cereal crops if the symbiosis trait can be transferred. Multiple avenues
are being explored to achieve this, from equipping plants to associate
with nitrogen-fixing bacteria to the transfer of the enzyme nitrogenase,
which is responsible for nitrogen fixation. Studies of these processes in
their native context have provided an understanding that was absent
30 years ago, when such approaches were first broached. Evolutionary,
genomic and mechanistic studies suggest that relatively few genetic
components might be needed to confer nitrogen-fixation capabilities.
In the case of transferring nitrogenase to plants, the restriction
of genetic components required was achieved by concatemerizing
bacterial genetic units to create a minimal set of three genes that are
necessary for the transfer of nitrogen fixation. Moreover, some
components of nitrogenase can be stably expressed in yeast and plants.
The evolution of the nitrogen-fixing symbiosis in legumes used many
components that function in associations with arbuscular mycorrhizal
fungi, which means that cereals already possess some of the necessary
building blocks; this could streamline engineering efforts
to transfer the nitrogen-fixing symbiosis. Recent phylogenomic
approaches indicate that the loss of nitrogen fixation is associated with
the loss of very few genes (between two and seven),
suggesting that a small set of genes could convert a species that
associates with arbuscular mycorrhizal fungi into one that can also form
a nitrogen-fixing symbiosis. This considerable engineering challenge
will require precise transcriptional and post-translational regulation of
multiple heterologous genes in cereals. Additional challenges are the
scarcity of well-characterized promoters for gene regulation in cereal
roots, and bottlenecks associated with transformation of cereals that
limit the scale of throughput required to test the engineering iterations
that will be necessary to achieve nitrogen fixation.


Benefits of plant-associated microorganisms 

The environment around and within plant roots includes microbial
communities. These can be relatively restricted or dynamic, and
responsive to nutrient status. Such communities or community
members have the potential to protect plants against pathogen
infections and, to some extent, drought. A greater understanding of
the mechanisms and the environmental conditions, including climate
effects, that control plant-microorganism assembly and activities
may enable the engineering of microbial communities to optimize
crop performance, particularly with microorganisms that are
engineered using synthetic biology approaches. Current research indicates
that some fungal species benefit host plants by enhancing phosphate
uptake, and within the diversity of cereal crops are lines that can
host active communities of nitrogen-fixers. The manipulation of
microbial associations to improve crop resilience to environmental
stresses is an area of intense research.


Prospectus on resilient crops 


Research advances have provided innovative opportunities and
technologies across the plant sciences, which can furnish solutions for
addressing future food security (Box 1). The strategies described here
for enhancing the resilience and sustainability of crops will only be
realistic if they are part of an integrated approach to agriculture that
is developed collaboratively with agronomists, engineers and farmers.
A critical challenge is the time from research discovery to true
and widespread implementation in agriculture. Some high-impact
breeding and genetically modified traits (for example, pest resistance
mediated by individual Bacillus thuringiensis Cry proteins) have spread
relatively rapidly. However, even in cases that involve breeding into
diverse varieties, the time from initial discovery and development to
broad use has often exceeded ten years. Regulatory processes and
intellectual property hurdles associated with technology can lead to
additional delays in implementation. The robust assessment of varieties
in variable field environments is essential to timely adoption. In the case
of submergence-tolerant SUB1 rice varieties, cooperation between
scientists, breeders and farm advisers helped to achieve farmer
acceptance and governmental certification within three years of gene
characterization. The visible yield advantage of SUB1 varieties after
flooding, and the absence of other differences from the varieties they
replaced, was key to their adoption. This success contrasts with the
failure to provide many farmers in climate-vulnerable areas with the
services of plant breeders to mobilize genetic variation for crop
improvement, and with selected or engineered genotypes that did not
translate to the field.
Complementary approaches and technologies may provide viable
opportunities (such as high-protein, salt-tolerant algae that require
limited freshwater), although new infrastructure, energy inputs and
engineering solutions will be needed.

Valuable genetic diversity for increasing crop resilience resides
in cultivated landraces, heirloom varieties and the wild relatives of
crops. Seed banks curate and distribute crop germplasm; the Crop
Trust (https://www.croptrust.org) is one of the leading efforts to collect,
conserve and use the approximately 50,000 species of wild relatives
of crops. These seed banks distribute germplasm that can be tapped
for adaptations to abiotic and biotic stresses, but greater investment in
high-throughput genotyping and phenotyping is needed to accelerate
mapping, the identification of genes and mechanisms, and downstream
breeding.

Addressing yield loss due to climate change, salinity and (re)emerging
diseases, weeds, parasitic plants and pests requires innovative
technologies and proactive responses, not unlike the development
of vaccines and innovations in modern medicine. The integration of
genetic resources and transformative technologies, from genome
editing to synthetic biology, is necessary to capture traits that
increase global food security and reduce the effects of agriculture
on the environment. An early failure of plant biotechnologists was
the lack of effective engagement with environmentalists, farmers
and consumers on questions of health and safety, despite strict
governmental procedures for the validation, release and monitoring
of genetically modified crops. It is critical that the specific method
used for crop improvement does not stymie the implementation
of safe and effective solutions. Non-politicized regulatory systems
are essential for scientific advances to scale to farmers within the
timeframe needed.

The current timeline for increasing the resilience and sustainability
of crops is too long. Crop varieties with new combinations or variants
of disease-resistance genes are in preparation for use against newly
emerged virulent pathogens. Advances in sequencing and the early
detection of invasive pathogenic strains should enable better
monitoring of disease, and therefore knowledge regarding where to deploy
particular crop genotypes. The horizon for tailored panels of appropriately
controlled genes that impart functional immunity in a commercial crop
is years away. The most rapid translation to the field will be for small
suites of genes from existing crop germplasm. For challenges that are
difficult to overcome (such as resilience to heat and aridity during
plant sexual reproduction), disruptive advances such as the asexual
propagation of seeds could lessen yield loss due to male infertility.
Success in the engineering of improved photosynthesis, nutrient use
and beneficial plant-microorganism interactions requires intensive
investment, but could result in the gains needed.

The plant sciences have a critical role in meeting the food and fibre
challenges of the future. Timely investments and research at many
levels and collaborative efforts are paramount to deploying resilience
mechanisms and improving the sustainability, yields and nutritional
value of our crops.


Vaccination against infectious diseases has changed the future of the human species,
saving millions of lives every year, both children and adults, and providing major
benefits to society as a whole. Here we show, however, that national and sub-national
coverage of vaccination varies greatly and major unmet needs persist. Although
scientific progress opens exciting perspectives in terms of new vaccines, the pathway
from discovery to sustainable implementation can be long and difficult, from the
financing, development and licensing to programme implementation and public
acceptance. Immunization is one of the best investments in health and should remain
a priority for research, industry, public health and society.

On 14 May 1796, 73 years before the first issue of Nature, and inspired
by Lady Montagu’s “variolation” concept, Edward Jenner inoculated
eight-year-old James Phipps with cowpox pus to prove that the less
virulent cowpox would protect against smallpox. This experiment was a
game changer in medicine and health. For the first time, it was possible
to medically prevent infection in a healthy person. Although vaccines
have been widely introduced in high-income countries since the late
1950s, it took 180 years after Jenner before the Expanded Programme
on Immunization (EPI) was launched in 1974, promoting access to six
essential vaccines in all countries worldwide. Today, vaccines against
26 infectious diseases are internationally available according to the
World Health Organization (WHO), although more have been licensed
worldwide, changing the future of the human species. Others are in
experimental public health use, such as Ebola vaccines, or pilot
implementation, such as the RTS,S malaria vaccine, and about 240 vaccine
candidates are in development (Table 1). The US Centers for Disease
Control and Prevention declared vaccination the number one success
story for public health in the twentieth century.

However, progress in vaccine coverage remains highly uneven, both
between and within countries, which threatens hard-won progress
and raises uncertainty about how to make further advances.
Vaccine-preventable diseases such as measles are on the rise, and episodes
of vaccine reluctance and refusal are occurring globally, calling into
question one of the most transformative interventions for survival and
health.

This Review focuses on preventive immunization in humans and
its impact (rather than on the vaccines themselves), including in low-,
middle- and high-income countries. We discuss the current status of
vaccine coverage and unmet needs; four hurdles that must be overcome
to ensure sustainable immunization programmes, starting with the
discovery of a new vaccine; and the growing issue of vaccine confidence.
We conclude with several opportunities and actions needed to realize the
full potential of immunization for human health and society.
Developmental challenges for vaccine production for low- and
middle-income countries, which were recently discussed in separate
articles, and therapeutic vaccines are not discussed.

Vaccines are biological products that induce protective immu- 
nity against infection and disease; they consist of sub-components, 
killed or inactivated organisms or live-attenuated viruses that train 
the immune system for a future response to a natural infection. They 
are probably the only medical intervention that is recommended for 
every single individual on the planet. Unlike therapeutics, vaccines 
are used in healthy people, and demand a very high standard of safety 
and require continuous monitoring for potential side effects. Besides 
considerations of safety, effectiveness, impact and cost, this raises 
complex governance, regulatory and public trust issues. All countries 
have a national immunization plan, often with goals inspired by the 
Global Vaccine Action Plan (GVAP) framework for 2011-2020.


How immunization has crucially benefited society 


It is hard to imagine a world without vaccines. A decade ago, the WHO,
UNICEF and the World Bank estimated that routine childhood
immunization programmes were preventing more than 2.5 million deaths
every year. With the increase in vaccine coverage, the growth of
populations, and the introduction of new life-saving vaccines,
immunization is ever more important for survival. In addition to
preventing deaths, vaccines prevent disease and disability, including in
adults and the elderly. In a high-income country such as the United
States, for a single birth cohort, vaccines prevent nearly 20 million
cases of disease, and more than 40,000 deaths.

A vaccine has for the first time in history eradicated a human disease,
smallpox. Efforts to eradicate polio are in the final stages, with
only two countries, Afghanistan and Pakistan, still experiencing wild
transmission of the polio virus. All countries with the exception of 13
have eliminated neonatal and maternal tetanus. Without vaccination,
there would be far more infections that require antibiotic therapy,
exacerbating the major problem of drug-resistant infections.

Table 1 | Historic timeline of introduction of vaccines

Year  Disease                                       Year  Disease
1798  Smallpox                                      1992  Japanese encephalitis (mouse brain)
1885  Rabies                                        1993  Cholera (recombinant toxin B)
1896  Cholera                                       1994  Typhoid (Vi) polysaccharide
1896  Typhoid                                       1994  Cholera (attenuated)
1897  Plague                                        1995  Varicella
1923  Diphtheria toxoid                             1996  Hepatitis A
1926  Pertussis (WC)                                1996  Pertussis (acellular)
1926  Tetanus toxoid                                1998  Lyme OspA
1927  Tuberculosis (BCG)                            1999  Meningococcal conjugate (group C) [a]
1935  Yellow fever                                  1999  Rotavirus (reassortant)
1936  Influenza                                     2000  Pneumococcal conjugate (7-valent) [a]
1937  Tickborne encephalitis                        2003  Influenza (intranasal, cold-adapted)
1938  Typhus                                        2005  Meningococcal conjugates (4-valent) [a]
1955  Polio (inactivated)                           2006  Human papillomavirus recombinant (4-valent)
1963  Measles                                       2006  Rotavirus (attenuated and new reassortants)
1963  Polio (oral)                                  2006  Varicella Zoster
1967  Mumps                                         2008  Rotavirus (monovalent)
1969  Rubella                                       2009  Japanese encephalitis (Vero cell)
1970  Anthrax secreted proteins                     2009  Cholera (WC only)
1974  Meningococcus polysaccharide                  2009  Human papillomavirus recombinant (2-valent)
1977  Pneumococcus polysaccharide (14-valent)       2010  Meningococcal type A conjugate (monovalent)
1980  Adenovirus                                    2010  Pneumococcal conjugate (13-valent)
1980  Rabies (cell culture)                         2014  Human papillomavirus (9-valent)
1981  Tickborne encephalitis                        2014  Meningococcal type B (fH factor)
1981  Hepatitis B (plasma derived)                  2015  Ebola (unlicensed) [b]
1983  Pneumococcus polysaccharide (23-valent)       2015  Malaria [c]
1985  Haemophilus influenzae type b polysaccharide  2015  Dengue
1986  Hepatitis B surface antigen recombinant       2015  Meningococcal type B [d]
1987  Haemophilus influenzae type b conjugate [a]   2016  Cholera (oral)
1989  Typhoid (Salmonella Ty21a)                    2018  Typhoid conjugate [a]
1991  Cholera (WC-rBS)

Table adapted from Plotkin & Plotkin (2018). The year of licensing is indicated wherever possible. rBS, recombinant B subunit; WC, whole cell.
[a] Capsular polysaccharide conjugated to carrier proteins.
[b] An investigational vaccine, rVSV-ZEBOV, was used under ‘expanded access’ during the Ebola outbreak in West Africa in 2015 and the 2018-2019 outbreak in the Democratic Republic of
the Congo; the Ad26.ZEBOV/MVA-BN-Filo vaccine was used in 2019 in Rwanda and the Democratic Republic of the Congo.
[c] Positive opinion from the EMA under article 58 issued in 2015. Approved for routine use in pilot implementation settings in Ghana, Malawi and Kenya in 2018.
[d] Reverse vaccinology.

Between 1990 and 2017, immunization contributed to a 55% global
decline in under-five mortality rates, with a drop from 87 to 39 deaths
per 1,000 live births. More than 14 million deaths are estimated to
have been prevented by measles vaccination alone between 2011 and
2020.
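
The 55% figure follows directly from the two mortality rates quoted above. The short check below is only an illustrative sketch; the function name is hypothetical and the inputs are the rates given in the text.

```python
# Illustrative arithmetic check of the quoted decline in under-five mortality.
# The function name is hypothetical; the rates are those given in the text.

def relative_decline(before: float, after: float) -> float:
    """Fractional decline from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    rate_1990 = 87.0  # deaths per 1,000 live births
    rate_2017 = 39.0  # deaths per 1,000 live births
    decline = relative_decline(rate_1990, rate_2017)
    print(f"Relative decline: {decline:.1%}")
```

The result, (87 - 39) / 87, rounds to the 55% stated in the text.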

Vaccination benefits not only those who are vaccinated, but also others
in their family and community. This population-wide benefit, known
as ‘herd immunity’, reduces the exposure of unvaccinated individuals
to pathogens through a reduction or interruption of the chains of
transmission. A recent study in Kenya showed that the introduction of
a pneumococcal vaccine resulted in not only a major reduction in
invasive pneumococcal disease, but also a nearly 100% decline in incidence
among infants too young to be vaccinated, and a more than 74%
reduction among unvaccinated children. Community or herd immunity is an
important consideration when estimating the full public health value
of immunization. The threshold to achieve such community protection
can be as high as 95% for measles, but as low as 80% for rubella, and 60%
in high-income settings for the effect to begin for pneumococcal
vaccination, which means that the programme strength required to derive
additional impact varies substantially by vaccine. These differences
in the required critical vaccination coverage rates are due to the basic
reproductive ratio of an infection (R0), which can vary greatly among
various infectious diseases. The R0 of a specific infection indicates the
average number of cases one case generates in a susceptible population;
in the case of measles it is 12-18, which is among the highest. It is an
indicator of how contagious an infection is, and determines the minimum
level of vaccination coverage needed to generate herd immunity.
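
The dependence of the coverage threshold on R0 can be illustrated with the textbook relation threshold = 1 - 1/R0. This formula is not stated in the text and is used here as an assumption: it presumes a homogeneously mixing population and a fully protective vaccine, so real programme targets may differ. The function names are hypothetical.

```python
# A minimal sketch of the classic herd-immunity threshold, assuming
# homogeneous mixing and a fully protective vaccine (simplifying
# assumptions that are not part of the source text).

def herd_immunity_threshold(r0: float) -> float:
    """Fraction of the population that must be immune so that each case
    infects, on average, fewer than one other person."""
    if r0 <= 1.0:
        return 0.0  # an infection with R0 <= 1 does not sustain transmission
    return 1.0 - 1.0 / r0


def implied_r0(threshold: float) -> float:
    """Inverse relation: the R0 that corresponds to a given threshold."""
    return 1.0 / (1.0 - threshold)


if __name__ == "__main__":
    for r0 in (12, 18):  # measles range quoted in the text
        print(f"R0 = {r0}: threshold of about {herd_immunity_threshold(r0):.1%}")
    print(f"An 80% threshold corresponds to R0 of about {implied_r0(0.80):.1f}")
```

Under these assumptions, the measles range of 12-18 gives thresholds of roughly 92-94%, in line with the "as high as 95%" quoted above, and the 80% figure for rubella corresponds to an R0 of about 5.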

Potential long-term effects beyond direct protection against a specific
pathogen or disease have been attributed to several vaccines, in particular
the BCG vaccine against tuberculosis and the measles vaccine, for which
observational studies suggested a survival advantage compared with
children who had remained unvaccinated. These non-specific effects (also
known as heterologous effects) would add to the disease-specific, proven
benefits of vaccines, and have been attributed to epigenetic changes in
innate immune cells as opposed to the adaptive immunity induced by the
antigen-specific responses to the vaccine. However, the importance of
heterologous effects remains controversial, and plausible immunological
findings still need to be validated in large-scale clinical trials.

Table 2 | Vaccines across the human life cycle

Life cycle stage: vaccines

Recommended immunization schedule

Newborns: BCG; hepatitis B; polio.

Infants/toddlers: Diphtheria; tetanus; pertussis; polio; Haemophilus influenzae type b; hepatitis B; influenza; pneumococcus;
rotavirus; malaria; meningococcus; varicella; measles; mumps; rubella; typhoid; yellow fever.
Under development: RSV; Salmonella spp.; Shigella spp.; ETEC.

Older children and adolescents: HPV; influenza; meningococcus; diphtheria booster; tetanus booster; pertussis booster.
Under development: group A streptococcus.

Adults: Influenza; diphtheria booster; tetanus booster; pertussis booster; varicella; HPV (depending on age at initial
vaccination).

Pregnant women: Tetanus; influenza; pertussis.
Under development: group B streptococcus; RSV; CMV.

Older adults (≥65 years): Influenza; diphtheria booster; tetanus booster; pertussis booster; pneumococcus; shingles.

Special health conditions (adults)

Immuno-compromised (including HIV infection): Influenza; pneumococcus.

HIV infection†: Influenza; pneumococcus; hepatitis B; meningococcus.
For CD4 count ≥200 cells per µl: measles; mumps; rubella; varicella.

Asplenia, complement disorder†: Influenza; pneumococcus; meningococcus; Haemophilus influenzae type b.

Chronic kidney disease (including haemodialysis)†: Influenza; pneumococcus; hepatitis B.

Chronic liver disease†: Influenza; pneumococcus; hepatitis A; hepatitis B.

Diabetes†: Influenza; pneumococcus; hepatitis B.

Heart or lung disease†: Influenza; pneumococcus.

Other circumstances

Travel: Hepatitis A; hepatitis B; typhoid; rabies; yellow fever; Japanese encephalitis; cholera; meningococcus; malaria.

Healthcare workers: Hepatitis B; influenza; measles; mumps; rubella; varicella; diphtheria; tetanus; pertussis; polio; BCG.

Not only do vaccines provide important health benefits for all stages in life, but they also provide benefits for travellers, healthcare workers and individuals with existing health conditions. Note
that the lists of vaccines are illustrative only, rather than exhaustive, and do not indicate that these are universally recommended for each life phase in all countries. Routine vaccines
recommended by the WHO are available at: https://www.who.int/immunization/policy/immunization_tables/en/ (last accessed 3 September 2019). BCG, Bacillus Calmette-Guérin; CMV,
cytomegalovirus; ETEC, enterotoxigenic Escherichia coli; RSV, respiratory syncytial virus.
†Additional vaccines are recommended for these conditions in the United States: diphtheria booster; tetanus booster; pertussis booster; and, for some of these conditions, measles; mumps; rubella; varicella.

The benefits of vaccines in general go beyond health, and include
economic, educational, health security and other benefits. Their full
economic value is not sufficiently quantified in assessments of
cost-benefit, or in investment terms, and is an increasing area of inquiry
and empirical measurement.

Vaccination is a sound investment: the return on investment
from childhood immunization in low- and middle-income countries
is high. For every US$1 invested in immunization against ten diseases,
$16-$18 are saved in healthcare costs, and the net return is as high as
$44 per dollar spent when the broad economic benefits are considered,
although the return on the investment varies by individual vaccine. This
compares with a cost of $27 per DTPcv3-vaccinated child (that is, a child
who has received all three doses of diphtheria-tetanus-pertussis
(DTP)-containing vaccine). In the United States, the net economic benefits
of vaccination in one birth cohort amount to almost $69 billion.

Modelling and observational data suggest that in low- and
middle-income countries, vaccination contributes to the alleviation of, and
protection against, poverty. The financial risk protection provided by
vaccination accrues to the poorest households through the
reduction of catastrophic and impoverishing health expenditures.
There is also evidence that vaccination improves childhood physical
development, educational outcomes, and equity in distribution of health
gains. Finally, without vaccines, absenteeism from school and work
would be much higher, and periodic epidemics would disrupt society.
The economic effects of periodic influenza epidemics, for example, are
enormous, and can be reduced by immunization.


Vaccination is a lifetime investment 

In addition to being the backbone of maternal and child health, vac- 
cines provide important health benefits for all stages in life (Table 2). 
Given adaptations of the immune system throughout life, not all
vaccines work equally well at all stages of life or in all geographical
regions.

Starting in infancy, the presence of maternal antibodies in the newborn
can impede the response to vaccines, as the neonatal immune
system undergoes its own journey of ontogeny, which enables it to
adapt from the ‘sterile’ in utero environment to the confrontation with
colonizing and potentially pathogenic microorganisms. Particular
immunological pathways have been identified.

Despite considerable progress in reducing the rates of under-five 
mortality, important gaps remain in addressing neonatal morbidity and 
mortality. Neonates are particularly vulnerable to infection with Gram- 
negative bacteria and group B streptococcus, for which no neonatal 
vaccines currently exist. The gap in early protection can potentially
be bridged by administering vaccines to women in pregnancy, relying 
on passively transferred antibodies to protect infants in the first few 
months of life, until vaccinations administered in infancy or later can 
provide protection. On the basis of this principle, tetanus, influenza 
and pertussis vaccinations are recommended for pregnant women to 
prevent neonatal infections such as neonatal tetanus. This maternal
immunization strategy may be expanded with promising vaccines
against group B streptococcus and respiratory syncytial virus.

Table 3 | From discovery to sustainable effect of immunization: overcoming four major hurdles

First hurdle: from discovery to early clinical development
Issues
• High risk for companies
• Safety key issue
• Few discoveries make it to actual products
Selected actions needed
• Incentives for industry for vaccines with no market in high-income countries
• Public-private partnerships and philanthropy

Second hurdle: from early clinical development to large efficacy trials
Issues
• Particularly challenging for vaccine candidates without high-income market potential
• Safety major issue, besides immunogenicity and efficacy
• Complex road to licensing
• Can take 3-10 years or longer
• Very expensive: two-thirds of total costs of new vaccine development
Selected actions needed
• End-to-end product planning need for major boost from private and public funding
• Clinical trial capacity and rationalizing trial methodology
• Regulatory harmonization and speed
• Manufacturing availability for GMP products to be used in trials

Third hurdle: from vaccine licensure to broad scale implementation
Issues
• Country capacity to take on new vaccines; that is, human and financial resources and the time to build political support and community demand
• Logistical issues, for example, cold chain, procurement management, organization of vaccination to ensure equity of access
• Supply not always sufficient
• Highly variable timeline by country
• Dependent on policy recommendations, cost-effectiveness deliberations and political priority
Selected actions needed
• End-to-end product solution
• National and international funding, Gavi transition management, tendering processes
• National regulatory harmonization
• Policy clarification and political leadership
• Manufacturing capacity
• Research on full societal value of vaccine assessment, implementation research and relevant cost-effectiveness models
• Equity of access

Fourth hurdle: achieving consistent, long-term supply and demand sustainability
Issues
• Never ending
• Continuing concern for every national immunization programme
• Issues may arise even after years of implementation
• Complex interplay of service delivery, supply and demand, societal trust, political and humanitarian conflicts
Selected actions needed
• Policy and political commitment
• Sustainable funding
• Management and logistics
• Tender processes
• Manufacturing capacity
• Good communication, safety surveillance and vigilance, including promptly addressing safety signals and signs of vaccine hesitancy

For adolescents, life-saving vaccines against human papilloma virus
(HPV; the cause of cervical, anal, penile and head and neck cancers) are
being increasingly introduced and need to be administered before the
likely acquisition of HPV via sexual contact. Vaccines against meningococcal
meningitis, a potentially lethal infection with a second peak
in adolescence, have also been introduced into this age group in some
countries. New platforms such as schools have had to be engaged to
administer these vaccines.

Outbreaks of mumps have very occasionally been seen in teenagers,
despite a solid vaccination record. These outbreaks could be due to
waning of protection induced by vaccines that are otherwise regarded as
highly efficacious, and they highlight the need for surveillance of all
age groups for disease outbreaks.

Booster vaccines against diphtheria, tetanus and polio are required
throughout adulthood to maintain protective immunity levels and
guarantee long-lasting protection, although recommendations
may vary by country.

A life-course approach to vaccination has become ever more pressing,
with pneumonia, influenza and shingles disproportionately affecting older
adults, and death rates from pneumonia and influenza 130 times higher
for adults over 85 than for younger adults. Vaccination of the elderly
with existing vaccines could prevent up to 90,000 deaths per year in
the United States alone.

Adult immunization does not have a clear prioritization in low- and
middle-income countries, and is a complex programme across high-income
countries. It is different from paediatric immunization, which has a global
programme and focused, substantial funding. As demographics
shift across the world to an older distribution, a focus on adult
immunization will become increasingly relevant, as advocated by the World
Coalition on Adult Vaccination. Despite national recommendations,
vaccine coverage among adults in high-income countries is uneven (vaccine
coverage for herpes zoster, which causes shingles, among adults
aged 60 or over in the United States was 24% compared with 65% for
influenza among those aged 65 or over), and adult vaccination has very
low coverage or is not even available
in most low- and middle-income countries. Yet, several studies have
shown good cost-effectiveness of adult vaccinations against influenza,
pneumococcal infection, shingles, HPV and tetanus-diphtheria-pertussis.

Important gaps also exist in our understanding of the fundamental
biology of adult immunization. Owing to ‘immunosenescence’ (the
gradual decline of the immune system associated with ageing),
vaccination of older adults is in general not as effective as in younger
people. The reasons for poorer responsiveness are not well defined,
although it is likely that several compartments of the immune
system are affected, and addressing them will require a new effort in
terms of strategies and products for immunization of adults.

There are three areas in which alterations to increase vaccine efficacy 
in the elderly could be considered: (i) increased vaccine potency; (ii) the 
use of adjuvants to enhance immunity; and (iii) application of immune 
modulators or other interventions to alter host immunity generally. 

As populations age across the world, it will be increasingly important 
to identify how to integrate immunization programmes in health and
care services to reach all age groups. 

In addition, vaccinations are needed for travel, particular professions 
or specific health conditions, and international travel has had a role
in the resurgence of measles in areas such as the United States.


From discovery to impact: four hurdles to overcome 


There are still major infectious diseases that require an effective
vaccine for control and ultimate elimination, such as HIV infection and
tuberculosis. Therefore, the continuing development of new vaccines is
a public health imperative. Unfortunately, most early vaccine candidates
in the discovery phase never make it to a safe and effective product.
Development and deployment of vaccines is a long and complex
process. We briefly describe here four hurdles that need to be overcome
from the discovery phase of a new vaccine to sustainable population
impact (Table 3).

The first hurdle is a ‘valley of death’ from discovery to early clinical
development, when a potential antigen, adjuvant or new vaccine
formulation developed in the laboratory is further tested for clinical
proof-of-concept and safety in humans, in addition to optimizing production
elements. Real progress has been made in recent years owing to several
public and private initiatives that are helping partly to overcome this
first major challenge, such as the Coalition for Epidemic Preparedness
Innovations (CEPI), which was created after the 2014-2015 Ebola
epidemic in West Africa to accelerate the development of vaccines against
epidemic pathogens.

The second hurdle in vaccine development, also referred to as the
‘second valley of death’, relates to the shift from early clinical
development to the large and very expensive efficacy trials that are most
often needed, unless a similar vaccine has already been developed and a
new product can be licensed using an established correlate of
immunity or protection. This is also the most expensive phase of vaccine
development, absorbing more than two-thirds of the total costs of
development of a new vaccine, including the building of special
manufacturing facilities and conducting phase 3 trials in several countries,
ideally with independent research partners. Often, this major financial
effort is beyond the means of smaller biotech companies, and in general
only big pharmaceutical companies and large foundations or public
institutions have the financial bandwidth to support such trials, which
can cost as much as hundreds of millions of dollars. For vaccine
candidates without a prospect of a high-income market to ensure
a return on investment, and when the potential market for the new
vaccine is limited to low- and middle-income countries, there is an
almost unsurmountable valley of death unless philanthropic and public
funding intervene.

The needs and unique challenges of vaccines against epidemic
pathogens demand innovation in product development pathways. The Merck
recombinant vesicular stomatitis virus-Zaire Ebola virus (rVSV-ZEBOV)
vaccine was deployed on a large scale during the recent Ebola outbreak
in eastern Democratic Republic of the Congo before the product was
licensed, even for indications for which no efficacy data were available,
such as primary prevention in healthcare workers. A second
experimental vaccine, Ad26.ZEBOV/MVA-BN-Filo, is now also deployed for the
same outbreak and in Rwanda. Well-informed country leadership and
transparent governance of such use are crucial, as is genuine community
involvement. The ‘animal efficacy rule’ that applies when efficacy trials
in humans are not feasible or ethical should also be considered for
vaccines against epidemic pathogens. The development of Ebola vaccines
has shown how this type of ‘learning by doing’ model can offer early
access in humanitarian situations, although it should be stressed
that nearly five years after the first Ebola vaccine clinical trials in West
Africa, no Ebola vaccine is licensed despite well-documented
immunogenicity, safety, and human and/or non-human primate efficacy data.
When a crisis such as Ebola is no longer the headline news, the sense of
urgency is lost, and regulators and normative committees go back to
often extraordinarily long processes.

After a successful phase 3 trial, there is a complex path to the licensing
of any new vaccine, which requires reproducibility and safety tests of
several batches of vaccines, while manufacturing facilities are finalized.
Many countries still request data from clinical trials conducted locally,
delaying country licensing and implementation considerably, while further
raising the costs of development. In Europe, there is advanced
harmonization in the regulatory approval of vaccines through the European
Medicines Agency (EMA), and in sub-Saharan Africa, the African Vaccine
Regulatory Forum (AVAREF) is aiming to strengthen regulatory capacity
for clinical trials and harmonization of regulatory practices.

Following all of these activities, which can take as long as ten years or
more, a new vaccine is ready for deployment, but a third hurdle can
occur between the licensing of a vaccine and broad-scale
implementation, which is dependent on both a policy recommendation and the
ability to implement. Many years can go by before important new vaccines
reach communities in need, the cost of which is measured in human lives
that could have been saved as well as money for their development.

There are many contributors to this third hurdle. First is cost, which
is especially relevant for countries that are neither wealthy enough
to procure vaccines at high cost nor poor enough to receive funding
assistance from Gavi, the Vaccine Alliance. However, when a
Gavi-eligible country transitions out of the programme owing to an increase in
its gross national income per capita, it needs to increasingly mobilize
domestic resources or other development assistance. Even when the
broader value proposition of a new vaccine is substantial, there remains
the question of affordability. Second is the question of country capacity
to take on new vaccines. The past decade has been a remarkable era for
vaccine introduction, with 113 countries having introduced at least one
new vaccine, which represents a real success story. Country capacity
to introduce and sustain ever-growing programmes involves human
and financial resources, and time to build political support and
community demand. Both the pneumococcal conjugate and the rotavirus
vaccines now have coverage in low-income Gavi countries that meets
or exceeds the global average; however, this reflects the fact that not
all countries in any income stratum have yet introduced these vaccines
in spite of their availability. Even high-income countries can
experience delays. In the United Kingdom, for example, a meningococcal B vaccine
was licensed in January 2013, recommended for introduction in March
2014, and finally announced for introduction in May 2015. It then took
more than 12 months to resolve procurement discussions to enable
implementation.

For products that address priority diseases for low-income countries,
the uncertainty of the market may put products at risk of collapse unless
a full end-to-end product solution is articulated, with non-commercial
support. Inclusion of the new vaccine in the WHO’s prequalification list is a
requirement for procurement through funders such as UNICEF and Gavi.
Some of these are vaccines against parasitic diseases, which are much
more complex than bacterial or viral vaccines because parasites carry a
wide range of antigens and often have a complex life cycle in which
different stages exhibit different antigens
relevant for vaccine protection. Thus, the RTS,S vaccine, the first ever
malaria vaccine used in a routine immunization system, took nearly
30 years from its creation by GlaxoSmithKline in 1987 until the EMA
issued a positive scientific opinion in 2015, and the WHO recommended
large-scale pilot programmes in 2016. These programmes took another
three years to start in several African countries, and demonstrate the
sometimes incredibly long development, licensing, and introduction
times. The RTS,S malaria vaccine is also an example of a vaccine for which
the clinical trial performance of partial protection led to a policy decision
to advance in a step-wise manner rather than full programmatic
deployment. This may become a more common pathway for future products,
in part because these vaccines have performance and implementation
characteristics that are more complex than those of current vaccines.

We are entering an era in which the path from vaccine licensing to 
routine implementation requires more than safety and efficacy data. 
Policy recommendations for new vaccines may only be realized after 
implementation research to determine how to ensure use and impact
most effectively. Deliberations about cost effectiveness, the full value 
of vaccine assessments, and country priorities in the face of constrained 
resources remain drivers for delays associated with the third hurdle. 
National Immunization Technical Advisory Groups (NITAGs) will be
increasingly important to guide evidence-based decision making. 

Even after the lengthy and costly trajectory to introduce a new
vaccine, ensuring sustainable impact faces a fourth set of hurdles that need
to be overcome. These include supply and demand sustainability, and
resilience and acceptance of immunization. Logistical issues such as
the in-country ‘cold chain’ system of transporting and storing vaccines
at recommended temperatures, procurement management, and the
organization of vaccination clinics in remote areas, as well as vaccine
hesitancy and equity of access, can all present challenges. In addition, the
misuse of vaccination campaigns as political tools has seriously damaged vaccine
confidence in areas such as the Philippines, Nigeria, Afghanistan, Italy
and Pakistan. Some side effects or limitations of duration of
protection may only become obvious after larger scale use, such as for live oral
rotavirus vaccination in high-mortality settings, pertussis vaccine and
others. A recent example is the results from a retrospective analysis of
long-term efficacy trials that show that although there is a clear overall
population benefit of the Dengvaxia vaccine against dengue, the vaccine
also caused an excess risk of severe dengue in seronegative vaccinees
(that is, those not previously exposed to dengue virus). In the Philippines,
this new risk was reported after more than 800,000 school children were
vaccinated, prompting a marked reaction by the public in 2018.

Stock-out events and vaccine manufacturing capacity have been
problematic for particular vaccines, even in high-income countries.
Manufacturers emphasize the time needed to build and commission a
factory. Although manufacturers in middle-income countries are now
supplying most low-cost vaccines globally, they face low profit margins,
ferocious tenders, and often unpredictable procurement schemes. More
efficient and modular production technologies may enable
decentralized production with lower capital costs.

Each of the four hurdles can be overcome, although the fourth one
should be a continuing concern for every national immunization
programme. Depending on the phase, they may require different sets of
policy actors, and are sometimes a matter of policy, management and
leadership, rather than money.

Throughout the development and use of vaccines, vaccine safety is an
overriding concern, and requires a continuous and careful scientific and
societal assessment. Safety monitoring during manufacturing typically
occupies a major part of the process and costs of a vaccine, and is a key
element of any vaccine programme. In specific high-income
populations, such as in the elderly, personalized medicine approaches have
been proposed to maximize both immunogenicity and safety in the
presence of chronic conditions and changes related to older age, but
large-scale applicability is still questionable at present.


Persistent unmet needs for vaccination


The extraordinary achievement of vaccines is reflected in countries
having vaccinated more than 116 million infants in 2018 alone, which
represents the largest number ever, and a comparable number of infants
were also estimated to have been vaccinated in 2017. The global and
regional coverage of diphtheria-tetanus-pertussis (DTP3)
vaccination between 1980 and 2018 in Fig. 1 shows overall high coverage with
regional variations, but also some stagnation in coverage over the past
10 years. Despite the high coverage, there still remained 19.4 million
under-vaccinated or unvaccinated children, who were vulnerable
to diseases that they could and should have been protected from.
Substantial improvements in coverage have been achieved in some
countries, whereas coverage is regressing in others, often because
of social disruption, conflict, or political upheaval, which highlights
the extremely dynamic nature of vaccine programme performance.

Fig. 1 | Coverage of DTP3 immunization over time globally, combining
coverage and regional variations, 1980-2018. The coverage of DTP3
(containing products) immunization improved rapidly in the 1980s, with large
regional variations. Stagnation over the past 10 years has meant that 19.4 million
children remained under-vaccinated or unvaccinated. The thick red line denotes
global coverage, and solid lines represent regional coverages. DTP3, diphtheria,
tetanus and pertussis; AFR, AMR, EMR, EUR, SEAR and WPR are WHO
sub-regions of Africa, Americas, Eastern Mediterranean, Europe, South-East Asia
and Western Pacific, respectively. Adapted from ref. °. Source: WHO/UNICEF
coverage estimates 2018 revision, July 2019.

Around 60% of all children who did not receive basic immunization
in 2018 live in ten countries: Angola, Brazil, the Democratic Republic of
the Congo, Ethiopia, India, Indonesia, Nigeria, Pakistan, the Philippines
and Vietnam. To achieve rapid change in this situation requires the full
commitment of governments, supported by international organizations.
The Gavi Alliance provides funding for vaccination programmes in low-
and lower-middle-income countries, and has had substantial impact.
The technical support provided by Gavi partners will be essential to
address persistent gaps in vaccine coverage. Consistently delivering
vaccines with high coverage, reaching at least the minimum coverage
required to achieve herd immunity in line with the basic reproductive
ratio of an infection as mentioned above, remains a struggle in many
other countries including in middle- and high-income settings, with
poor children not being reached. For example, in 2017 in the United
States, 100,000 children under the age of two (1.3% of the population of
that age) were not immunized against DTP and MMR (measles, mumps
and rubella), which represents a fourfold increase since 2001.
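
The link between the basic reproductive ratio and the minimum coverage needed for herd immunity can be illustrated with the standard textbook approximation, threshold = 1 - 1/R0. The sketch below is only an illustration under simplifying assumptions (a homogeneously mixing population and a single effectiveness value per vaccine). The function names and the R0 and effectiveness values are hypothetical choices made for this example and are not taken from the article.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Immune fraction needed to block sustained transmission: 1 - 1/R0."""
    if r0 <= 1.0:
        return 0.0  # an infection with R0 <= 1 cannot sustain itself
    return 1.0 - 1.0 / r0


def required_coverage(r0: float, effectiveness: float) -> float:
    """Coverage needed when a vaccinated person is protected with
    probability `effectiveness` (a value above 1.0 means the threshold
    cannot be reached by vaccination alone)."""
    if not 0.0 < effectiveness <= 1.0:
        raise ValueError("effectiveness must be in (0, 1]")
    return herd_immunity_threshold(r0) / effectiveness


# Illustrative R0 values only; real estimates vary by setting and pathogen.
for r0 in (2.0, 5.0, 15.0):
    print(f"R0 = {r0:>4}: threshold {herd_immunity_threshold(r0):.1%}")
```

Under this approximation, the more transmissible an infection is, the narrower the margin for coverage gaps becomes, which is why small declines in coverage matter most for highly contagious diseases.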

Of particular concern are countries in which vaccination coverage
has declined. There are 19 countries that had more than 80% coverage
for first-dose measles at some point between 2011 and 2017, but with
coverage in 2018 at least 10% lower than their peak coverage. The measles
vaccine coverage of those 19 regressing countries now ranges from 38%
to 88%, with 10 countries well below 80% coverage. Some of the
regression in vaccine coverage may represent improvements in data
rather than actual slippage in coverage. The data systems to monitor
both the number of children born and the number of children vaccinated
accurately are highly variable in quality. In some settings,
management and reward systems probably incentivize inaccurate reporting
of coverage data to meet targets, rather than incentivizing accurate
reporting.
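
The regression criterion described above (more than 80% first-dose measles coverage at some point between 2011 and 2017, and 2018 coverage at least 10% below that peak) can be sketched as a simple filter. This is a minimal sketch, not the method used for the published count. It assumes that "10% lower" means ten percentage points, and the function name, country labels and coverage values are hypothetical.

```python
from typing import Dict


def is_regressing(coverage: Dict[int, float],
                  high: float = 80.0,
                  drop: float = 10.0) -> bool:
    """Return True if coverage peaked above `high` in 2011-2017 and the
    2018 value is at least `drop` percentage points below that peak.

    Assumption: the 10% drop is read as percentage points, not as a
    relative decline.
    """
    earlier = [coverage[y] for y in range(2011, 2018) if y in coverage]
    if not earlier or 2018 not in coverage:
        return False  # not enough data to classify
    peak = max(earlier)
    return peak > high and (peak - coverage[2018]) >= drop


# Hypothetical coverage series (percent), for illustration only.
examples = {
    "Country A": {2011: 85.0, 2014: 91.0, 2017: 88.0, 2018: 78.0},
    "Country B": {2011: 92.0, 2015: 93.0, 2018: 90.0},
    "Country C": {2012: 75.0, 2016: 79.0, 2018: 60.0},
}
for name, series in examples.items():
    print(name, is_regressing(series))
```

Because the filter depends entirely on reported figures, the data-quality caveats in the paragraph above apply to its output as well.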

Outbreaks of measles, diphtheria and yellow fever show
what happens when the world is complacent and immunization coverage
declines. Diphtheria outbreaks surged in Russia in the early 1990s;
outbreaks of meningitis occurred among Rohingya refugees from Myanmar
in refugee camps in 2017; and the transmission of polio persists in parts
of Afghanistan and Pakistan. Measles outbreaks are occurring in all
regions of the world. The recent 80-fold increase in reported measles
cases in the WHO European Region over four years to more than 82,000
cases in 2018 with 72 deaths is a result of a mixture of vaccine refusals,
cultural beliefs, and access issues that include interruptions in vaccine
supply, such as in Ukraine, and has led to a WHO declaration of a grade
2 health emergency. In the Americas, thousands of cases have been
reported in Venezuela owing to the political and economic crisis, with
cases also appearing in Brazil, Colombia and Ecuador, and four countries
in the WHO European Region (United Kingdom, Albania, Greece and the
Czech Republic) have now lost their measles elimination status. The
United States is also at risk of losing its measles elimination status.

These outbreaks reflect failures to achieve and maintain high vaccina- 
tion coverage, community by community. Low vaccination coverage 
and high heterogeneity in coverage are most pronounced among African
countries, where routine rates of immunization in many countries are
well below the GVAP targets.

Since 2010, routine immunization levels have either stagnated or 
decreased in 54 out of 85 middle-income countries, which do not qualify
for support from the Gavi Alliance. Vaccine expenditures per child
are often lower in middle-income countries than in low-income Gavi
countries. The issue may not be solely due to a lack of funding
capability, but may also arise owing to a lack of prioritization of immunization,
countries not participating in pooled procurement mechanisms such 
as via UNICEF, low volumes of vaccines, insufficient efforts to reach vul- 
nerable populations, vaccine choices, and duplicative local regulatory 
requirements that delay the introduction of new vaccines. 

Another unmet need concerns the introduction of new vaccines. 
Rapid progress has been made to scale up the introduction of vaccines 
through Gavi investments in low-income countries, but not all vaccines
have progressed at the same rapid pace. The adolescent HPV vaccine 
has been particularly slow to be introduced outside of high-income 
settings because of programmatic challenges, public-access issues, 
supply constraints and pricing issues. 

Addressing these unmet needs will require persistent
implementation of strategies that have been shown to be effective, such as detailed
microplanning of local efforts to ensure that all children are identified and
immunized, as well as special campaigns and approaches such as drone
delivery of vaccines in areas that are harder to reach. Systematic evaluation
and implementation research should be part of these efforts to develop
a firm evidence base for overcoming such programmatic challenges. The
WHO has elaborated guidance on implementing high impact
immunization programmes (Global Routine Immunization Strategies and
Practices, GRISP) to address these unmet needs. Middle-income countries
that do not benefit from funding from the Gavi Alliance need
procurement mechanisms that can secure more predictable tiered pricing. No
set of strategies, however, will succeed without substantially enhanced
domestic investment and local political commitment, the lack of which
continues to limit progress in many parts of the world. As demand for services from
communities increases, responsiveness to that demand from
governments, the funder of such services in most countries, is more likely.

In addition to the unmet needs related to existing vaccines, nearly
half of all deaths from infectious diseases are caused by infections for
which no vaccine is available (for example, more than 0.5 million deaths
globally in children under 5 years from enteric infections for which there
is no vaccine). These should be the priorities for vaccine research and
development, as well as improvements needed for particular vaccines
such as those against rotavirus, pertussis, polio and yellow fever.
Innovations in delivery devices are also important (for example, micropatches,
temperature-stable vaccines, improved cold-chain equipment).


The equity imperative 

Equity has been a primary goal of immunization programmes. To reach
those who are in greatest need means addressing issues of vaccine
availability, affordability, accessibility, acceptability and financing. An
effective immunization system that delivers vaccines with high equity
across social and ethnic strata, maternal and community education,
and geographies, is a purpose-built programme to deliver impact, and
has been shown to be the crucial programmatic target.

Country-level vaccine coverage values mask subnational inequity, 
risking disease outbreaks and backsliding on achievements of vaccina- 
tion. Immunization improvements should focus at the subnational level, 
as well as on other determinants of inequity, not all of which would be 
addressed by focused supplementary vaccine campaigns. 

There is a special case for vaccine development for pathogens that 
cause epidemics. These diseases have little to no market incentive to 
drive product development, hence the need for innovative
arrangements such as the CEPI, US Biomedical Advanced Research and
Development Authority (BARDA) and the European Innovative Medicines
Initiative (IMI; https://www.imi.europa.eu).

Humanitarian crises are another increasing impediment to immu- 
nization. The number, size and duration of conflicts, the migration of 
refugees, and natural disasters have all caused major disruptions to 
immunization programmes and resulted in serious disease outbreaks. 
The persisting hurdles to the eradication of polio reveal how political, 
social and conflict situations can disrupt access to populations and risk 
violence targeting vaccinators, such as in Pakistan and Afghanistan.
Nearly 100 polio vaccinators and their security guards have been
targeted and killed while attempting to reach children for vaccination.


The growing challenge of vaccine confidence 


Despite the success and wide acceptance of the importance of
immunization, there are growing groups of people who delay or refuse vaccines. In
2013, the WHO Strategic Advisory Group of Experts (SAGE) established a
working group to investigate the scope and scale of vaccine hesitancy,
the US National Vaccine Advisory Committee (NVAC) put together a
Vaccine Confidence Working Group to investigate the situation in the United
States (National Vaccine Advisory Committee, 2015), and the European
Centre for Disease Prevention and Control (ECDC) published a review
of the state of vaccine hesitancy in Europe. In January 2019, the WHO
named vaccine hesitancy as one of the top ten global health threats.

Since 2015, the Vaccine Confidence Index (VCI) has surveyed more 
than 300,000 respondents globally to detect early signals of waning 
public confidence in vaccine importance, safety and effectiveness, 
to prompt early intervention where needed (see Fig. 2 for world map 
of confidence in vaccine safety in 2018). The European Commission
adopted the VCI as part of an effort in 2018 to strengthen cooperation
against vaccine-preventable diseases, and the Wellcome Trust used
the VCI as part of their 144-country study into public confidence in
vaccines (Wellcome Global Monitor 2018). Safety was identified as a key
issue in both the 2018 European study and the Wellcome report, with
public confidence in vaccine safety being consistently lower than the
confidence in vaccine effectiveness and importance.

Although a lack of familiarity by both physicians and parents with 
many childhood diseases because of years of successful vaccination 
programmes may have a role in a lack of interest in vaccines, the reasons
for a decline in vaccine confidence are far more complex. Newer
challenges to vaccine confidence include social media campaigns that 
have disrupted MMR vaccination efforts in southern India, collapsed 
HPV vaccination efforts in Japan, provoked false scares of vaccine 
poisoning in Pakistan, and undermined vaccination programmes in 
Indonesia. 

Vaccine confidence issues are highly varied by setting and vaccine. 
In a three-year review (2015-2017) of the WHO/UNICEF Joint Reporting
Form (JRF) completed annually by national immunization programmes, 
over 90% of the 194 countries reported that they experienced vaccine 
hesitancy. The top three reasons for hesitancy were: (1) ‘risk-benefit 
(scientific evidence)’—that is, safety concerns; (2) lack of knowledge 
on the benefits of immunization; and (3) religion, culture and socio- 
economic issues’. 
Fig. 2 | Global confidence in vaccine safety in 2018. Levels of confidence in
vaccine safety varied considerably across countries and regions, with several
countries showing very low levels of confidence. The colour chart at the bottom
shows increasing levels of confidence. Note that the question asked in the survey
was ‘Do you agree with the following statement: vaccines are safe?’. Source: ref.””.
Map credit: Alexandre De Figueiredo, The Vaccine Confidence Project.

Challenges around building confidence in vaccine safety go well
beyond communication, although more accessible public communication
around the complex issues of safety and risk-benefit analysis is
important. What needs to be addressed is not only better communication
around the known, albeit sometimes misinterpreted, risks and
benefits of vaccination, but also investment in more research in the areas in
which the public is asking questions and the science is incomplete. Findings
that the AS03-adjuvanted influenza vaccine Pandemrix was linked
to increased cases of narcolepsy in Europe prompted further research,
but a systematic review concluded that more research is needed™.

Although uncertainty is the norm in science, the political and social
worlds of the public have become less tolerant of ambiguity and risk’™.
New modes of listening to the public, with rapidly evolving technologies
to monitor social media, can collect emerging safety questions as well
as detect signals of possible issues that need investigation. Working
towards better alignment between public questions and accessible,
evidence-based answers should be a goal. The WHO Vaccine Safety Net
initiative is an important resource and can be further built on to address
new questions as they emerge, as well as to make new research accessible’™.

Social and political contexts and the reliability of health services are
important levers of trust, and a low-trust setting will have less tolerance
for risk than one with high trust. A 2015 study showed that high trust
in immunization services clearly correlated with lower rates of vaccine
hesitancy’. The public’s experience with health services and health
workers is highly influential in vaccine decisions, but trust in both is needed.
The Wellcome Global Monitor report showed that in Japan, for example,
despite low trust in vaccines and low trust in government, confidence
in health providers remained high.

Introducing new vaccines into populations requires adequate time to
train and prepare front-line health workers and vaccinators to be ready
to manage public questions, and continuing dialogue between scientists
and the public will be important to build confidence from the start, as
well as to anticipate and manage adverse events.

As mentioned above, reported risks of a recently introduced dengue
vaccine’® in the Philippines amplified into public outrage mediated
through Facebook pages, and were made more complex because the
events occurred during political elections. The result was a marked drop
in public confidence in vaccines more generally from 99.5% in 2015 to
76.2% in 2018, and confidence in vaccine safety plummeted from 99.5%
to 65.2% (Fig. 3). The overall drop in public trust affected willingness
to accept even the measles vaccine, prompting measles outbreaks with
more than 25,000 measles cases and 355 deaths by March 2019" and
requiring considerable efforts to rebuild public confidence and increase
vaccine uptake.
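
The size of these declines can be expressed both in percentage points and
relative to the 2015 baseline. The short Python sketch below works through
that arithmetic using only the four survey figures quoted above; the function
name and the dictionary labels are illustrative assumptions and are not part
of the survey's own methodology.

```python
def confidence_drop(before, after):
    """Return (percentage-point drop, relative drop in %) between two survey shares."""
    points = before - after
    relative = 100.0 * points / before
    return round(points, 1), round(relative, 1)


# Survey shares (per cent) quoted in the text for the Philippines, 2015 and 2018.
# The dictionary keys are illustrative labels, not names used by the survey.
figures = {
    "overall confidence": (99.5, 76.2),
    "confidence in safety": (99.5, 65.2),
}

results = {label: confidence_drop(*pair) for label, pair in figures.items()}

for label, (points, relative) in results.items():
    print(f"{label}: down {points} points ({relative}% of the 2015 level)")
```

Reporting both measures side by side avoids the common ambiguity between a
fall of so many percentage points and a fall of so many per cent.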

Conflict situations also affect confidence in vaccines and vaccina- 
tors owing to an environment of distrust and uncertainty, such as in 
Pakistan and Afghanistan, and inthe Democratic Republic of the Congo, 
where local violence and conflict in the Ebola-affected areas has been 
an obstacle to vaccination efforts. 


The future of immunization 


The contribution of immunization to human health, security and 
prosperity has been matched by few other activities in health and 
development, and has been crucial for progress in child survival. As 
immunization coverage among adults is generally low, it is another area 
in which greater advances can be made. 

Addressing the following issues will be crucial to ensure that the effect 
of vaccination is optimized. 

(1) Leadership and funding. Achieving immunization for all those 
in need should be a top priority for every country. This will require
stronger political leadership and a continuing increase in investments 
in immunization, both domestically and internationally’. The power 
of immunization to achieve wider health and societal benefits should 
be further documented. The prioritization of vaccines is particularly 
crucial for middle-income countries that no longer benefit from sup- 
port from the Gavi Alliance and for countries that are transitioning out 
of Gavi support. 

A successful replenishment of Gavi resources in 2020 for the pro- 
posed Gavi 5.0 strategy’” is vital for the next decade of progress in 
child survival, and will be a test of the commitment of the international
community to immunization and global health. 

(2) Universal vaccine coverage and equity. Overcoming the stagnation
in reaching all people in need with even the basic vaccines is an overriding
priority in all countries, especially in those with the lowest coverage
and the greatest number of unvaccinated children. As we look towards
the next decade, ensuring that vulnerable people in all countries are
not left behind should be a top concern, particularly in middle-income
countries, as there will be more poor people living there than in poorer
countries”.

Fig. 3 | Changing levels of vaccine confidence in the Philippines between
2015 and 2019. A marked reduction in public confidence in vaccine safety in the
Philippines was partly due to the impact of social media and local politics.
Source: The Vaccine Confidence Project®, data collected by the Gallup
International Association (PSRC).

Ensuring a sustainable and affordable supply of quality vaccines, with
differential pricing according to the wealth of a country, is fundamental to
achieving sustainability and equity of immunization. Only a few multinational
companies are producing vaccines, and a growing number of manufacturers in
middle-income countries are major suppliers. There is a risk that continuous
lowering of prices may lead to new monopolies, and possibly to higher
prices. Healthy vaccine markets with sustainable supply are an important
objective for vaccine programmes. Harmonization and strengthening of
regulatory capabilities of low- and middle-income countries are essential.
Initiatives such as the AVAREF*’ deserve support. The fact that some
countries require local clinical trials despite WHO pre-qualification can
be a source of major delays in the introduction of vaccines.

(3) People-centred programmes. Immunization programmes can 
become more effective with a systems-driven and ‘precision public 
health’ approach, taking into account local variation in immunization 
levels, specific needs, cultural specifics, and circumstances of vulnerable
populations. Quality data at administrative levels closer to communities
should be collected to inform ‘micro-planning’ and adaptive programme
delivery. Innovative efforts such as thoughtful integration of immuniza- 
tion into health services, education systems and elderly care are needed. 

As most vaccines have incomplete efficacy, tailored approaches to 
optimize their impact will be needed, particularly for vaccines against 
malaria, influenza, dengue and probably HIV, when one becomes available.

(4) Vaccine confidence. Vaccine confidence needs to be addressed 
up front and be an integral part of immunization programmes. Many 
approaches to increasing vaccine uptake do not take into account the 
social, historical and political realities of the public for whom informa- 
tion alone is not the antidote to vaccine reluctance. Instead of older 
demand-creation models, a new model and language of engaging 
with the public is needed, starting with better listening and prompt 
responding to concerns as well as building on local capacities. Inclusion 
of non-traditional partners, new modes of digital communication, social 
scientists, and religious and traditional leaders has been invaluable in
addressing hesitancy around polio vaccination, and the engagement of 
teenage girls in co-designing social media outreach to address HPV vac- 
cination concerns had positive effects on vaccine uptake in Denmark. 
With safety anxieties being reported as one of the top reasons for vac- 
cine hesitancy, aligning vaccine safety research with dominant safety 
concerns will also be important for confidence building. 


(5) Investment in research and innovation. Many issues mentioned in
the other recommendations require further research in a wide range of
disciplines. Product innovation as a result of the formidable progress
in immunology and infection pathogenesis has been a strong driver of
immunization programmes. There is reluctance of industry to develop
vaccines when market incentives are limited and licensing is uncertain.
Although companies such as Merck and Johnson & Johnson invested
considerably in the development of candidate Ebola vaccines, partly
supported by public funds in North America and Europe and without
a prospect of a return on investment, it would be unrealistic to expect
that industry will follow this example for each new emerging pathogen.
There is a major role for the public sector and philanthropy to support
mechanisms such as the CEPI to develop vaccines for low-income
countries’. As discussed under the ‘second hurdle’, on the challenge
of funding and conducting late clinical development through to market
introduction for vaccines for which there is no market incentive, there
is an urgent need to address this gap, possibly via a specific global
initiative or at least a concerted action of several funders. There is also a
need for innovation in trial design (for faster trials with smaller sample
sizes, and including collection of valuable biosamples to inform correlates
of protection) and in trial analysis, as well as in vaccine delivery.
Escalating antimicrobial resistance is a powerful incentive to develop
vaccines against bacterial infections, malaria, tuberculosis and HIV
infection’°> “°. Innovation in the delivery of vaccination programmes
is as important as product innovation.

The world cannot afford to turn the clock back on immunization, 
and ever more innovative vaccines will offer additional opportunities 
to reduce mortality and improve the quality of life for every person 
on the planet. This will require the best of science, entrepreneurship,
programme implementation on the ground, and politics. 

With rapidly changing ecology, urbanization, climate change, increased travel and
fragile public health systems, epidemics will become more frequent, more complex
and harder to prevent and contain. Here we argue that our concept of epidemics must
evolve from crisis response during discrete outbreaks to an integrated cycle of
preparation, response and recovery. This is an opportunity to combine knowledge
and skills from all over the world, especially from at-risk and affected communities.
Many disciplines need to be integrated, including not only epidemiology but also
social sciences, research and development, diplomacy, logistics and crisis management.
This requires a new approach to training tomorrow’s leaders in epidemic prevention
and response.

When Nature published its first issue in 1869", a new understanding of
infectious diseases was taking shape. The work of William Farr’, Ignaz 
Semmelweis’, Louis-René Villermé* and others had been published; 
John Snow had traced the source of a cholera epidemic in London*® 
(although Robert Koch had not yet isolated the bacterium that caused 
it°). The science of epidemiology has described patterns of disease in 
human populations, investigated the causes of those diseases, evalu- 
ated attempts to control them’ and has been the foundation for public 
health responses to epidemic infections for over 100 years. Despite 
great technological progress and expansion of the field, the theories 
and practices of infectious disease epidemiology are struggling to 
keep pace with the transitional nature of epidemics in the twenty-first
century and the breadth of skills needed to respond to them. 
Epidemiological transition theory has focused mostly on the effects
of demographic and socioeconomic transitions on well-known preventable
infections and a shift from infectious diseases to non-communicable
diseases®. However, it has become clear that current demographic
transitions (driven by population growth, rapid urbanization,
deforestation, globalization of travel and trade, climate change and political
instability) also have fundamental effects on the dynamics of infectious
diseases that are more difficult to predict. The vulnerability of
populations to outbreaks of zoonotic diseases such as Ebola, Middle
East respiratory syndrome (MERS) and Nipah has increased, as have the rise
and spread of drug-resistant infections, marked shifts in the ecology
of known vectors (for example, the expanding range of Aedes mosquitoes)
and massive amplification of transmission through globally
connected, high-density urban areas (particularly relevant to Ebola,
dengue, influenza and severe acute respiratory syndrome-related
coronavirus SARS-CoV). These factors and effects combine and interact,
fuelling more-complex epidemics.

Although rare compared to those diseases that cause the majority of 
the burden on population health, the nature of such epidemics disrupts 
health systems, amplifies mistrust among communities and creates 
high and long-lasting socioeconomic effects, especially in low- and 
middle-income countries. Their increasing frequency demands atten- 
tion. As the Executive Director of the Health Emergencies Program at 
the World Health Organization (WHO) has said: “We are entering a very
new phase of high-impact epidemics... This is a new normal, I don’t 
expect the frequency of these events to reduce.””. 

We have to act now but act differently: a broader foundation is 
required, enhancing traditional epidemiology and public health 
responses with knowledge and skills from a number of areas (Table 1). 
Many of these areas have long been associated with epidemic preparedness
and response, but they must now stop being seen as esoteric ‘nice
things to have’, and instead become fully integrated into the critical 
planning and response to epidemics. 

This will require considerable changes by the global public health 
community in the way that we respond to epidemics today and how 
we prepare for and seek to prevent those of tomorrow. It will mean 
reshaping the global health architecture of the response to epidemics 
and transforming how we train new generations of researchers and 
practitioners for the epidemics of the future”. 

The modern research culture, often shaped by the behaviour of
funders, has required many researchers to specialize in narrow fields,
with less emphasis on translation than on field-specific innovations.
Although this siloed landscape has brought major advances in global
health, it is not fit for the transitional phase of epidemic diseases: rapidly
evolving, high-impact events bring together communities, responders
and researchers who do not routinely interact. Different assumptions,
cultures and practices, each of which may be widely accepted within a
particular community, make working together in outbreak situations
more challenging. Fundamental to success is respect for and understanding
of the contribution each party brings. In a successfully integrated
approach, we each have to realize that our knowledge and skills are a
small part of a rapidly expanding toolkit (Box 1). We need to understand
major trends in research and how and when they may influence
the response to an epidemic, develop new research to strengthen the
support that we can provide across other areas and learn to operate in
multi-stakeholder situations, including, at times, as part of a critical
debate to bring better practices to the fore.

Table 1 | Selected key areas to integrate into twenty-first
century epidemic responses

Area                            Key areas and/or disciplines

Governance and infrastructure   Local, national and international organizations; integrate
                                accountability and transparency across multiple
                                stakeholders; improve data sharing, improve logistics
                                and crisis management

Engagement and communication    Encourage a community-led response, community
                                engagement and health diplomacy

Social sciences                 Anthropology, political science, human geography,
                                linguistics

Ethics                          Consent, clinical trial designs

Emerging technologies           Pathogen genomics, metagenomics, systems serology
                                and analytics, data science and artificial intelligence

Research and development        Diagnostics, therapeutics and vaccines

One Health                      Ecology and environmental, veterinary and agricultural
                                sciences

Central to this approach must be the communities who are at risk 
and those affected by epidemics: local people are the first responders 
to any outbreak and their involvement in the preparation and response 
activities is essential. From communities, through local and regional 
health authorities, national public health institutes and international 
organizations—including many essential partners in sectors beyond 
public health—the integrated approach must be supported. The WHO, 
in particular, has a critical part to play, using its unique mandate not to 
lead every aspect of preparation, response and recovery, but to change 
its practices, facilitate integration with and among others, and ensure 
accountabilities are built in from the bottom to the top. 


Nineteenth and twentieth century epidemiology 


A wave of cholera epidemics across Europe in the 1830s and 1840s catalysed
a new era of ‘infectious disease diplomacy’ globally. Nations
recognized that infections do not stop at borders and that therefore 
multilateral collaboration is essential to protecting citizens from lethal 
epidemics. The development of germ theory through the second half 
of the nineteenth century” transformed ideas about the causes of 
infections, informing scientific research as well as clinical responses. 
Scientific understanding translated into vaccines” and antibiotics, 
while programmes for child health, hygiene, clean water and sanita- 
tion became common in the twentieth century. As a result, childhood 
diseases such as measles and mumps became rare, smallpox was eventually
eradicated" and polio was eliminated from all but a handful of
countries». Many people thought that infectious diseases would soon
be history. Sir Frank Macfarlane Burnet is often cited for his remark in the
1970s that, with the emergence of new diseases being a distant prospect, 
“the future of infectious diseases will be very dull”. 

Although the focus in high-income nations turned to non-communicable
diseases, which constituted a considerable and increasing
burden on the health of their citizens, infectious diseases did not
disappear. Some endemic infections such as malaria and tuberculosis
were not susceptible to elimination strategies, and new diseases with
epidemic and pandemic potential emerged. Ebola virus disease was
first identified in the 1970s, HIV/AIDS in the 1980s, Nipah virus in the
1990s, SARS and MERS at the start of the twenty-first century, and
many more have since been identified. Far from becoming ‘very dull’,
the field of infectious disease epidemiology has sometimes struggled
to adapt: as late as 1990, respected researchers used a nineteenth
century ‘law’ of epidemiology to make predictions about the AIDS
epidemic; these turned out to be vast underestimates”. Advances in
other fields gave epidemiology the chance to evolve. In 2001, when
the editors of the International Journal of Epidemiology provocatively
asked whether it was time to ‘call it a day’ given the putative power
of genomics to explain diseases over the capacity of epidemiologists
to describe them, their conclusion was that genomics had the potential to
positively transform epidemiology as much as the rise of germ theory
a century earlier.

Box 1

Non-traditional tools for epidemics

Artificial intelligence

Advances in computer science and computing speeds have led to
a number of applications of artificial intelligence across society™.
Applications in epidemiology include tracking online searches
about disease symptoms to aid early detection of epidemics,
although more sophisticated methods may be required before
artificial intelligence becomes a reliable detection tool®.

Crystallography

Modern X-ray diffraction and electron microscopy can reveal
structures of viruses and antibodies in such detail that it is possible
to identify specific sites of vulnerability on the virus. A previous
study showed how such techniques identified an antibody that was
much more potent against respiratory syncytial virus than the only
currently available intervention®.

Platform vaccine technology

Developing vaccines for emerging infectious diseases has many
challenges, including the time it takes, a limited market and strict
regulatory requirements for products that will be given to healthy
people’’. Platform technologies use one underlying approach with
standardized processes and some antigen-specific optimization
to speed up both development and manufacture of vaccines.
For example, vector-based platforms combine an antigen, or a
gene for an antigenic protein or peptide, in a virus-like particle or
liposome. Such platform technologies have the potential to deliver
vaccines a few months after an emerging pathogen is identified
and sequenced, rather than years®.


The new normal


At least 150 pathogens that affect humans have been identified as 
emerging, re-emerging or evolving since the 1980s”, while increasing 
rates of antimicrobial resistance threaten to make formerly controlled 
infections, such as malaria, untreatable*°; this also limits our ability to
control their epidemic potential. The demographic transition is driving 
much of this: human society is becoming more urban than rural for the 
first time in our history, bringing large numbers of people (and often 
animals) together in densely populated areas”. Agricultural and forestry 
practices are changing the relationships between people, animals and 
our respective habitats”. Travel is more accessible around the world, 
so migration, trade and tourism bring more people into contact and
thus affect disease transmission”. Climate change has many effects 
on ecosystems and environments, not least in changing the habitats 
and migratory habits of disease vectors”. States with weak health sys- 
tems are far less likely to cope with or recover from multiple emergent 
demands without damaging routine services”. Inequalities”, inequities 
and distrust in national structures and institutions compound people’s 
vulnerabilities””. Conflict increases the risk of epidemics and makes 
responding to them close to impossible”®. 

Since 2000, there have been several outbreaks of Ebola (including 
the two biggest in history), not to mention outbreaks of SARS, MERS, 
Nipah, influenza A subtype H5N1, yellow fever, Zika and the continued
spread of dengue. Epidemics overlap and run into each other, yet the
world is not currently equipped to cope with this increasing burden of
multiple public health emergencies. Preparing for epidemics, therefore,
requires global health, economic and political systems to be integrated
just as much as infectious disease epidemiology, translational research
and development, and community engagement. 


Essential areas in epidemic response 

Governance and infrastructure 

Epidemics represent shared risks that cross borders and all of society. 
Health systems, routine care, trust in governments, travel, trade, busi- 
ness—all are disrupted during an epidemic. With such broad risks, the 
preparation and response must be nationally owned and led, interna- 
tionally supported and undertaken with a whole-of-society approach. 
Some initiatives have started to build frameworks for this to happen in
a coordinated way. For example, the WHO’s Pandemic Influenza Preparedness
Framework brings together nation states, industry, other
stakeholders and the WHO to implement a global approach to pandemic
preparedness and response”’.

A focus must be building coordinated regional and country expertise, 
resources and capacity through national and regional public health 
institutions’. This brings its own challenges: governance of institutions,
leadership, collaborations and interventions have to be impeccable 
or misconduct can thrive. Unwelcome in itself, misuse of funding, 
resources or people within efforts intended to support an epidemic 
response will also undermine trust in the organizations that respond 
to an outbreak and, in turn, prolong the outbreak. 

Key governance components include drafting policies in advance and 
being willing to implement those policies for data collection and shar- 
ing during epidemics. They must be flexible enough to enable affected 
communities and nations to retain ownership of the response, while 
drawing on international expertise to find the best possible response. 
Governance should also include processes for vaccine and therapeutic 
approvals during outbreaks. However, it is clear that the centre of grav- 
ity for leadership, governance and implementation must be where the 
need is greatest if these are to truly deliver. 

In 1971, Julian Tudor Hart proposed the inverse care law: “The avail- 
ability of good medical care tends to vary inversely with the need for it 
in the population served.””. An analogue of the inverse care law can be
applied to public health and epidemiology. Expertise in these fields has 
traditionally gravitated towards centres of excellence in Europe and the 
United States. Of course, high-income countries are not immune to the
disruption associated with epidemics, especially in an era of misinformation
and growing mistrust in authorities and public health initiatives.
However, the centre of gravity must shift so that globally representa- 
tive distributed networks of collaborating centres can jointly ensure 
coverage in the regions that urgently need these skills on the ground®. 
International collaborations remain important; however, strengthening
epidemiology, public health and laboratory capacity in low- and middle- 
income countries is essential. Collaborative interventions should not 
be limited to when there is a major outbreak, but be integrated into 
regular interactions. 

Capacity, resources, expertise and governance can be supported by
the increasing role for regional and national centres of disease control. 
The US Centers for Disease Control (CDC) lends its expertise all around 
the world in addition to protecting the US population. In 2004, the Euro- 
pean CDC started, followed by the China CDC in 2015 and by the Africa 
CDC in 2017. Although more can be done to improve data sharing and 
access to laboratories, the networks and connections between these 
centres have strengthened all of their work, as well as having a positive 
effect on public health systems in low- and middle-income countries. 


Engagement and communication 

During the pan-European wave of cholera in the 1830s, there were riots 
across the continent: doctors, nurses and pharmacists were murdered, 
hospitals and medical equipment destroyed”. Similar reports today usually
come from communities that have not had positive prior interactions
with public health initiatives, and thus the encounter with national or
international teams who arrive only in response to a ‘new’ disease means
that trust can never be assumed and has to be earned on both sides.
Engagement needs to start before an outbreak—ensuring that patients, 
their families and their communities are at the centre of all public health 
is essential for the successful prevention and response to epidemics. 
There is no public health without the support of the community. 

For example, the early detection of disease events will be improved 
if more national and regional public health institutions establish com- 
munity event-based surveillance systems. Communities are the first 
to know when something unusual happens*®—therefore training and 
mobilizing community volunteers to report such occurrences is a cost- 
effective way to rapidly detect diseases and contain them at the source. 
This will also help to sustain engagement between communities and 
the organizations that respond to outbreaks. Furthermore, improved 
information flow between the community and the public health sys- 
tem should provide a better understanding of local social networks to 
complement other means of tracking chains of transmission between 
individuals and places. This can be the community themselves, or it 
might be veterinarians who see clusters of sick animals, or nurses and 
doctors who care for patients in primary care—or it may be teams that 
are often forgotten in public health initiatives, such as those working 
in critical care facilities; it is striking how the first cases of Nipah, SARS,
MERS and influenza A subtype H5N1 were all identified by clinical
teams in critical care facilities. 

An inclusive, whole-of-society approach is challenging, and the 
challenges may be magnified in a conflict or post-conflict zone. Wars 
and conflicts not only increase the risk of epidemics as people move 
to escape violence and health services become harder to maintain”®, 
but also make public health responses vulnerable to interruption, thus 
making them less effective. Then, miscommunication, mistrust, disease
and violence can fuel each other in a vicious cycle. Engaging local
communities remains the highest priority, even in unstable contexts 
such as North Kivu and Ituri provinces of the Democratic Republic of 
the Congo (DRC)”, where an Ebola epidemic started in August 2018. 
It seems inevitable that responding to epidemics in politically unsta- 
ble environments will become more common, and skilled negotiators 
and peacekeepers will have to be better integrated in response teams. 
Equally essential, therefore, will be an improved understanding of these
challenging operational contexts among affected communities and 
external responders alike. 


Social sciences 

Social scientists have long applied their skills and knowledge in epi- 
demic responses, although their roles have become more visible in 
recent years**. By focusing on communities, social science human- 
izes the epidemic response”, helps to increase understanding of 
context and may uncover associations between the context or local 
practices and the risk of transmission. The Social Science in Humani- 
tarian Action Platform” has successfully produced rapid reports and 


Box 2 


Precision public health 


Precision medicine refers to the use of genomic sequencing to 
retrace the specific course of a disease in individual patients, with 
the aim of being able to choose the best treatment option for each 
person. In public health, the analogous idea of precisely directing 
the right intervention to the right population is equally appealing. 

The potential of such an approach has been illustrated by the 
identification of two areas in the United States in 2016 that were 
at risk of Zika transmission®’. Rather than the whole country, or 
even only Florida, being declared at risk, these two areas each 
measured less than 5 km², and the response focused only on these
specific neighbourhoods. By contrast, a campaign against yellow 
fever, also in 2016, defined risk ‘at the level of entire nations’. 

A broad interpretation of precision public health®° 
incorporates many different types of data to increase the power 
of epidemiology”. Such data would not only include genomic 
information, but also satellite imaging, mobile phone data, social 
media use data and so on. For example, a study published in 2019 
combined epidemiological surveillance data, travel surveys, 
parasite genetics and anonymized mobile phone data to measure 
the spread of malaria parasites in southeast Bangladesh”. A 
retrospective analysis of mobile phone call data in Sierra Leone 
from 2015 showed how it might have been used to assess 
the impact of travel restrictions on mobility during the Ebola 
epidemic*®. 

The principle of selecting the most relevant information from 
all available data seems within the scope of good epidemiological 
practice already. The challenge is recognizing and incorporating 
new types of data when they become available. 


briefings on regions in which an epidemic has been identified, and the
Global Research Collaboration for Infectious Disease Preparedness
includes a social science research funders’ forum to ‘propel research
in this area’, acknowledging that its integration in the preparation and
response to outbreaks is often missing or added as an afterthought to
solve a problem that could have been foreseen. There is still much to
learn about how epidemic responders and social scientists can make 
the most of each other’s expertise” and how data from social science 
can fit into the wider information architecture of epidemic response. 

As an example, behavioural surveillance* will be critical in twenty-first
century responses to disease outbreaks“. Just as behavioural surveillance
to improve the understanding of HIV was crucial in identifying
high-risk groups for HIV infection, so human behaviours will continue to
be important as we respond to future infectious diseases. For instance,
the Ebola virus outbreak in West Africa probably began before Decem- 
ber 2013, but it took several months before hospital transmission and 
traditional burial practices were found to be the leading causes of its 
rapid spread. 


Emerging technologies 
The increasing prevalence of mobile phones, wireless internet connec- 
tivity and social media activity raises the possibility of using these tools 
to gather data for epidemiological studies, diagnostics*, population 
mobility during an Ebola epidemic” or influenza incidence in real time”. 
Future developments in predictive technology, machine learning and 
artificial intelligence will bring more opportunities to move towards 
‘precision public health’ (Box 2). 

The use of data from people is becoming strictly controlled, however,
and it will be a challenge to persuade countries to invest in a new


surveillance system, for example, before its general effectiveness has 
been demonstrated at a country level*’. Even then, technology-based 
solutions should be integrated with community-based programmes and 
other existing epidemic preparedness and response systems because 
surveillance is more effective when standardized among different coun- 
tries, districts and communities. To this end, suites of guidance and 
open-access standardized tools are being developed for reporting cases 
of disease, as well as consent forms, standard operating procedures and 
training materials’, properly validated diagnostic assays and access to 
quality-assurance panels in public® and veterinary” health. The rising 
trend of engaging citizens in data gathering is also welcome—the use of 
mosquito-recognition apps enables the collection of data far beyond 
the capacity of routine mosquito surveillance”. This way, citizens feed 
information into the public health system and the feedback loop offers 
a fast and direct way to provide citizens with details of potential actions
that they can take. 

As well as potentially supporting diagnosis and surveillance, the 
fast-developing field of genomic epidemiology™ can yield information 
to track the evolution of a virus such as Ebola during an epidemic>”®. 
There will be times when it can detect outbreaks better than traditional 
epidemiology, illustrating the need to have these tools available in the
same toolbox. During the large Lassa fever outbreak in Nigeria in 2018,
real-time genomic sequencing provided clear evidence that the rapid
increase was not due to a single Lassa virus variant, nor attributable to
sustained human-to-human transmission. Rather, the outbreak was 
characterized by vast viral diversity defined by geography, with major 
rivers acting as barriers to migration of the rodent reservoir’. These 
findings were crucial in containing the outbreak. 

Developing and sustaining the capacity to conduct real-time sequenc- 
ing with adequate bioinformatics analyses at regional and national levels 
will be challenging in low- and middle-income countries. Moreover, 
investments in relatively high-tech capacity (such as real-time sequencing)
are competing with other, arguably more fundamental needs, such
as equipment and training in primary laboratories. Political engagement
must be nurtured between epidemics: it is not enough to offer
technological and laboratory support during a crisis, even with the
promise of building capacity, if the political will is not there. However,
with proper preparation, and accessible and trusted data sharing and 
governance mechanisms, laboratories with limited resources may be 
able to leap-frog into the twenty-first century. 


Research and development 
Vaccination is one of the most effective public health interventions,
and innovative strategies for research and development of vaccines,
such as using ring vaccination as a trial design during Ebola epidemics
since 2015, must be encouraged. At the start of the 2013-2015 epidemic
in West Africa, vaccine candidates were already in development,
based on a long history of preclinical research, although a lot of work
was still required to get clinical trials underway in time to be useful™.
In 2015, when Zika was first internationally recognized as a pathogen
that could cause birth defects™, there was hardly any research and no
vaccines in late-stage development. Two-and-a-half years later, results
from three phase I clinical trials had been reported®, although challenges
remained for further development. The lack of a profitable market for 
such products means that pharmaceutical companies lack the incentives 
to push this work between epidemics. Initiatives such as the Coalition 
for Epidemic Preparedness Innovations are attempting to positively 
disrupt financing models for vaccines against epidemic diseases, and 
stockpiles of meningococcal vaccine, yellow fever vaccine and oral 
cholera vaccine are maintained by the International Coordinating Group 
to minimize potential delays due to limited manufacturing capacity”. 
Similarly, if investigational treatments or vaccines are to be used as
part of the response to an epidemic, ethical protocols® for managing 
informed consent and introducing them in clinical settings must be 
planned in advance with at-risk communities (Box 3). Trial designs? 
Box 3


Epidemic ethics 


In 2016, the PREVENT project received Wellcome funding to 
provide ethics guidance “at the intersection of pregnancy, 
vaccines, and emerging and re-emerging epidemic threats”*°. This 
was in response to the newly recognized association between 
infection with Zika virus during pregnancy and microcephaly in the 
newborn. Developing a vaccine was an obvious route to explore, 
but many researchers felt that they could not conduct clinical 
trials with pregnant women because it is generally assumed that 
the risk to the woman, the fetus or both outweighs any potential 
benefit. However, as Heyrana et al. argue: “Preventing pregnant 
women from participating in clinical trials is well intentioned but 
misguided.”. 

PREVENT rapidly developed guidance for including pregnant 
women and their babies in Zika vaccine research®, and has since 
extended their scope to “a roadmap for the ethically responsible, 
socially just, and respectful inclusion of the interests of pregnant 
women in the development and deployment of vaccines against 
emerging pathogens.”*. 

Integrating ethics in the preparation and response to epidemics 
does not close off avenues of research; it opens up possibilities 
and expedites progress. 


should be created as soon as the option becomes viable. The essen- 
tial consideration is how the resulting data can add to previous trials 
and influence the approach to trials in future epidemics. For example, 
research during the 2013-2015 Ebola epidemic enabled progress on 
therapeutic agents” that are now being trialled in the ongoing outbreak 
in DRC”. Scientific progress during and between epidemics must be 
matched by other workstreams, such as the preparation of supply chain
logistics and communication with at-risk populations. Plans have to be 
made for a series of future outbreaks, enabling adaptive, multi-year, 
multi-country studies”. Similar plans are needed for continual preclini- 
cal research to ensure that future vaccine and therapeutic pipelines 
will be filled. 


One Health 
The term ‘One Health’ is used to acknowledge that human, animal and
ecosystem health are tightly interconnected and need to be studied in 
the context of each other (Fig. 1). Changes in the environment—whether 
natural or anthropogenic—affect interactions between pathogens, vec- 
tors and hosts in multiple and complex ways, making the emergence 
or decline of endemic, epidemic and zoonotic diseases difficult to pre- 
dict, while epidemics of animal diseases can challenge a community’s 
access to food. The fact that pools of viruses, bacteria and parasites are 
maintained in wild and domesticated animals” makes surveillance of 
potentially zoonotic diseases an intrinsic part of One Health epidemic 
planning. Many agencies and nations around the world now use prioritization
tools such as those developed by the US CDC” or the United
Nations (UN) Food and Agriculture Organization (FAO)” to identify and
prioritize zoonotic diseases of concern. An early precedent was a joint
consultation on emerging zoonotic diseases by the WHO, the FAO and 
the World Organisation for Animal Health in 2004”. Understanding 
disease ecology in the zoonotic reservoir could potentially lead to ways 
to predict the risk of human disease, thus providing the basis for smart 
early-warning surveillance systems. 

Individual countries with limited resources for epidemiological 
studies and epidemic preparation and response must decide their 
own priorities. However, infectious diseases do not respect borders. 
Fig. 1 | An ecosystem of interactions. The tightly interconnected nature of
human, animal and environmental health makes the emergence and decline of
epidemics difficult to predict. One Health integrates multiple perspectives in a
framework that emphasizes the need to consider any particular aspect in this 
broader context. 


Similarly, the interdisciplinary nature of One Health means there are 
several different lenses through which different sectors assess risks 
and priorities. For One Health approaches to work, these multiple per- 
spectives must be taken into account, whether human health or animal 
health, ecology or social sciences”. 


Recovery 


Epidemics do more than cause death and debilitation: they increase 
pressure on healthcare systems and healthcare workers and draw 
resources from services not directly linked to the epidemic. This can 
leave a legacy of distrust between people, governments and health systems,
although more-positive outcomes have been found to strengthen
relations between communities and public authorities. The full social 
and economic costs of the Ebola outbreak in West Africa have been 
estimated” to be as high as US$53 billion when including the effect on 
health workers, long-term conditions suffered by 17,000 Ebola survi- 
vors, and costs of treatment, infection control, screening and deploy- 
ment of personnel beyond West Africa. As healthcare resources became 
increasingly allocated to the Ebola response, hospital admissions fell 
and deaths from other diseases rose markedly, adding US$18.8 billion 
to the estimated cost. Such pressure can be withstood in high-income 
countries with strong health systems, but in low-income countries the
pressure can quickly reach a breaking point. 

Ebola killed almost 1.5% of doctors, nurses and midwives in Guinea, 
6.85% in Sierra Leone and just over 8% in Liberia®°. This is compared to 
mortality between 0.02% and 0.11% of the whole population of these 
countries. Estimates of the effect of this loss on maternal mortality 
suggest that thousands more women may have died in childbirth each 
year since the epidemic ended. Beyond the tragic deaths of so many 
healthcare workers, people were less likely to use health services for 
children or adults during the epidemic, suggesting decreased trust or 
even fear of healthcare settings®. More recently, in some areas affected
by the 2018 Ebola outbreak in DRC, the introduction of free non-Ebola 
healthcare led to unprecedented demand. However, healthcare facilities 


were not given sufficient additional resources to care for the number of 
people, which may have contributed to nosocomial infections. 
Survivors, too, need to be cared for long after the epidemic is declared 
over. A cohort of more than 3,000 children is growing up in Brazil after
being born with microcephaly because their mothers were infected with 
Zika during pregnancy. Tracking the development of these children 
increases understanding of the effects of Zika infection and helps to 
define what medical and social support the affected families may need 
as many of the children will grow up with severe developmental delays. 


Outlook 


The challenges posed by twenty-first century epidemics are real and 
changing: future epidemics will be fuelled by conflict, poverty, climate 
change, urbanization and the broader demographic transition. In our 
response we must consider epidemics not as discrete events, but rather 
as connected cycles for which we can prepare, even if we cannot predict 
specific outbreaks. The challenge is then to choose the right response 
at the right scale in the right area at the right time. There needs to be a
greater emphasis on absorbing and using positive lessons from each 
episode and avoiding those that led to negative outcomes™. 

The way that we train practitioners and researchers working in 
all fields relevant to today’s epidemic landscape has to change. A 
modern approach that is capable of characterizing epidemics and 
the best ways to control them must go beyond a narrow definition of 
epidemiology that sustains artificial barriers between disciplines. 
Instead, it must be able to integrate tools and practices from a diverse
range of established and emerging scientific, humanistic, political, 
diplomatic and security fields. We believe that such an approach needs 
to become the norm for the curriculums of schools of public health 
around the world. 

As well as training new generations of epidemiologists so that they 
have the skills, knowledge and networks to recognize and make use 
of every tool available to help them to do their work effectively, the 
entire architecture of the response to epidemics has to be adapted. 
Only then will we be able to maintain the comprehensive and effective 
response—including prevention and research—needed to stop epidem- 
ics and protect people’s lives, no matter what the circumstances. 
The goal of sex and gender analysis is to promote rigorous, reproducible and
responsible science. Incorporating sex and gender analysis into experimental design
has enabled advancements across many disciplines, such as improved treatment of 
heart disease and insights into the societal impact of algorithmic bias. Here we discuss 
the potential for sex and gender analysis to foster scientific discovery, improve 
experimental efficiency and enable social equality. We provide a roadmap for sex and 
gender analysis across scientific disciplines and call on researchers, funding agencies, 
peer-reviewed journals and universities to coordinate efforts to implement robust 
methods of sex and gender analysis. 
Integrating sex and gender analysis into the design of research, where
relevant, can lead to discovery and improved research methodology. 
A deeper understanding of the genetic and hormone-mediated basis 
for sex differences in immunity, for example, promises insights into 
novel cancer immunotherapies'. Evidence that facial recognition sys- 
tems misclassify gender more often for darker-skinned women than 
for lighter-skinned men has led to refinements in computer vision?. 
Understanding sex-based responses to climate change allows better 
modelling of demographic change among marine organisms and the 
downstream effects for humans**. Sex or gender analysis can be critical 
to the interpretation, validation, reproducibility and generalizability of
research findings (Box 1). 

The documented importance of sex and gender analysis in research 
has underwritten policy change at major funding agencies. New policies 
have been implemented at the Canadian Institutes of Health Research 
(2010), European Commission (2014), US National Institutes of Health 
(2016), German Research Foundation (2020), among others. Concur- 
rently, peer-review journals have implemented editorial guidelines to 
evaluate the rigour of sex and gender analysis as one criterion among 
many when selecting manuscripts for publication. The goal is to increase
transparency, promote inclusion and reset the research default to care- 
fully consider sex and gender, where appropriate. 

In this Perspective, we discuss how incorporating sex and/or gender
analysis into research can improve reproducibility and experimental efficiency,
help to reduce bias, enable social equality in scientific outcomes
and foster opportunities for discovery and innovation. From highlighted
examples, we extract decision-tree roadmaps for researchers across 


disciplines. We consider the limits to sex and gender analysis and offer 
recommendations to researchers and funding agencies on how to move
the field forward. Throughout this Perspective, we explore how integrating
sex and gender analysis into research design has the potential to
offer new perspectives, pose new questions and, importantly, enhance 
social equalities by ensuring that research findings are applicable across 
the whole of society. 


Reproducibility and efficiency 


Reproducibility is important for scientific excellence. One important 
reason for a lack of reproducibility in experimentation is inconsistency
in methodological reporting, which varies widely across disciplines from
biology to chemistry, human-robot interaction, medicine, physics, psychology
and beyond>*. Sex- and gender-specific reporting is still limited
in a range of scientific disciplines. In preclinical microbiology and immunology,
a review of published studies using primary cells from diverse
animal species (that is, humans and nonhuman vertebrates) revealed 
that the majority failed to report the sex of donors from which the cells 
were isolated”®. In marine science, a review of experimental ocean acidi- 
fication studies showed that only 3.9% of studies statistically assessed 
sex-based differences, while only 10.5% of studies accounted for possible 
sex effects by assessing females and males independently’. Similarly, 
in ecotoxicology, a review of omics studies showed that although most 
reported sex, only 23% (5 out of 22) examined the omics response of each 
sex to a toxicant”. In social robotics, the notion of robot gender, gender-
stereotypical domains and their interaction with user gender has only
recently become a target of scientific inquiry”. A lack of transparency
in reporting sex and gender-related variables makes it difficult to repro- 
duce experiments in which these variables affect experimental results. 


Disaggregating the data 

Analysing experimental results by sex and/or gender is critical for 
improving accuracy and avoiding misinterpretation of data (Fig. 1). The 
common practice of pooling the response of females and males or 


‘Institute of Gender and Health, Canadian Institutes of Health Research, Université de Montréal, Montreal, Quebec, Canada. College of Life and Environmental Sciences, University of Exeter, 
Exeter, UK. °Center of Excellence Cognitive Interaction Technology, Department of Psychology, Universitat Bielefeld, Bielefeld, Germany. “Biomedical Data Science, Stanford University, 
Stanford, CA, USA. °Chan-Zuckerberg Biohub, San Francisco, CA, USA. °History of Science, Gendered Innovations in Science, Health & Medicine, Engineering and Environment, Stanford 
University, Stanford, CA, USA. ’These authors contributed equally: Cara Tannenbaum, Robert P. Ellis, Friederike Eyssel, James Zou. *e-mail: schiebinger@stanford.edu 


Nature | Vol 575 | 7 November 2019 | 137


Perspective 


Box 1 


Distinguishing sex and gender 


Sex refers to the biological attributes that distinguish organisms 
as male, female, intersex (ranging from 1:100 to 1:4,500 in 
humans, depending on the criteria used°"”’) and hermaphrodite 
(over 30% of noninsect nonhuman animals”). In biology, sex 
describes differences in sexual characteristics within plants or 
animals that go beyond their reproductive functions to affect 
appearance, physiology or neuroendocrine, behavioural and 
metabolic systems. In engineering, sex includes anthropometric, 
biomechanical and physiological characteristics that may affect 
the design of products, systems and processes. 

Gender refers to psychological, social and cultural factors 
that shape attitudes, behaviours, stereotypes, technologies and 
knowledge. Gender includes three related dimensions. Gender 
norms refer to spoken and unspoken rules in the family, workplace, 
institution or global culture that influence individuals. Gender 
identity refers to how individuals and groups perceive and present 
themselves within specific cultures. Gender relations refer to 
power relations between individuals with different gender roles 
and identities’. 

Sex and gender interact in unexpected ways. Pain, for 
example, exhibits biological sex differences in the physiology of 
signalling. Pain also incorporates sociocultural components in 
how symptoms are reported by women, men and gender-diverse 
people, and how physicians understand and treat pain according 
to a patient’s gender™®. 


women and men can mask sex differences. For example, consider cope- 
pods, small aquatic crustaceans. Failure to disaggregate and analyse 
data by sex leads to the false interpretation that increased levels of pCO₂
have no significant biological effect on respiration (Fig. 1b). By contrast,
disaggregating data by sex reveals important sex-based differences in
the respiration rate of females and males in response to increased pCO₂
levels”.
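
The masking effect of pooling can be sketched with a few lines of Python. The numbers below are invented for illustration and are not the copepod measurements; they are chosen so that females respond to elevated pCO₂ with a higher respiration rate and males with a lower one. The helper names `treatment_effect` and `by_sex` are hypothetical.

```python
from statistics import mean

# Hypothetical respiration rates (arbitrary units), keyed by (sex, treatment).
# These values are illustrative assumptions, not data from the cited study.
data = {
    ("female", "control"):  [10.0, 11.0, 9.0, 10.0],
    ("female", "elevated"): [13.0, 14.0, 12.0, 13.0],
    ("male", "control"):    [8.0, 7.0, 9.0, 8.0],
    ("male", "elevated"):   [5.0, 4.0, 6.0, 5.0],
}


def treatment_effect(groups):
    """Mean response under elevated pCO2 minus mean response under control."""
    control = [x for (_, trt), xs in groups.items() if trt == "control" for x in xs]
    elevated = [x for (_, trt), xs in groups.items() if trt == "elevated" for x in xs]
    return mean(elevated) - mean(control)


def by_sex(groups, sex):
    """Keep only the groups that belong to one sex."""
    return {key: xs for key, xs in groups.items() if key[0] == sex}


pooled_effect = treatment_effect(data)                    # sexes pooled together
female_effect = treatment_effect(by_sex(data, "female"))  # females only
male_effect = treatment_effect(by_sex(data, "male"))      # males only

print("pooled:", pooled_effect)
print("female:", female_effect)
print("male:", male_effect)
```

With these invented values the pooled effect is zero, while the female and male effects are equal in size and opposite in sign, which is the kind of pattern that pooling would hide.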

The same is true for human research. Pooling data yields inexact 
results. In a human-robot experiment, humans were asked to touch
or point to anatomical regions on a 59-cm NAO robot. When asked to
touch accessible regions (such as hands and feet), there was little physiological
reaction; when asked to touch inaccessible regions (such as
the plastic buttocks or genitals of the robot), human participants had 
increased heart rate and blood pressure”. Equal numbers of women 
and men were recruited for the experiment; however, the data were not 
disaggregated or analysed separately. We know that norms for human 
social touch vary according to the age, gender identity and cultural 
background of the participant—as well as social context and purpose of 
the touch“. If results are not stratified by these variables, opportunities 
will be missed to provide clearer insights into their influence on human 
judgments and behaviour. 


Variability, sample size and interactions 

Scientists have erroneously assumed that females should be excluded 
from experiments because of the variable nature of the data caused by 
the reproductive cycle”. In fact, research has shown that males exhibit 
equal or greater variability than females for specific traits owing to fluc- 
tations in testosterone levels and other factors, such as animal group 
caging”. Analysis of microarray datasets reveals similar findings that 
females are no more variable than males on measures of gene expression 
in both mice and humans”. Accounting for sex and gender enhances the
likelihood of detecting meaningful effects, elucidating unexplained
variability and potentially reducing the overall number of experiments 
required to determine trends or make ground-breaking discoveries. In 
a meta-analysis of 11 proteomics datasets from humans and mice, sex 
explained 13.5% of the observed variation of complex protein abun- 
dances and stoichiometry, even more than other environmental factors, 
such as diet’®.

On the surface, it may appear that including females and males, women
and men in a study necessitates doubling the number of experimental
participants. However, this is not always the case. More efficient experimental
designs can incorporate both sex and gender while maintaining
control over variance”. Factorial designs, in which two experimental 
factors with multiple levels are tested, and data are collected across all 
possible combinations of factors and levels, are one such strategy. This 
enables the effect of each factor to be tested, in addition to the interac- 
tion between the factor levels. For such cases, sample sizes may need to 
be slightly increased by 14-33% to account for the extra parameter being 
estimated, but they do not need to be doubled, according to sample 
size calculators that consider interaction effects”°”. Analysing data by 
sex or gender enhances the likelihood of detecting meaningful effects 
that, in turn, help to reduce confounding, increase reproducibility and
reduce the cumulative number of experiments required. 
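
As a rough illustration of the factorial approach, the sketch below analyses a 2 × 2 design (sex × treatment) from its cell means and compares the sample size implied by a 14-33% increase with outright doubling. The outcome values and the baseline of 40 participants are hypothetical assumptions; only the 14-33% range comes from the text above. The function name `factorial_sample_size` is illustrative and does not refer to any specific sample size calculator.

```python
import math
from statistics import mean

# Hypothetical outcomes for a 2 x 2 factorial design, keyed by (sex, treatment).
cells = {
    ("female", "control"): [10.0, 12.0, 11.0],
    ("female", "treated"): [15.0, 17.0, 16.0],
    ("male", "control"):   [9.0, 11.0, 10.0],
    ("male", "treated"):   [11.0, 13.0, 12.0],
}
m = {key: mean(values) for key, values in cells.items()}

# Treatment effect within each sex.
effect_female = m[("female", "treated")] - m[("female", "control")]
effect_male = m[("male", "treated")] - m[("male", "control")]

# Main effect of treatment: the treatment effect averaged over both sexes.
treatment_main_effect = (effect_female + effect_male) / 2

# Main effect of sex: the female-minus-male difference averaged over treatments.
sex_main_effect = (
    (m[("female", "control")] - m[("male", "control")])
    + (m[("female", "treated")] - m[("male", "treated")])
) / 2

# Interaction: how much the treatment effect differs between the sexes.
interaction = effect_female - effect_male


def factorial_sample_size(n_baseline, increase_percent):
    """Sample size after a percentage increase, rounded up to whole participants."""
    return math.ceil(n_baseline * (100 + increase_percent) / 100)


n_baseline = 40  # hypothetical size of a study that ignores sex
print("treatment main effect:", treatment_main_effect)
print("sex main effect:", sex_main_effect)
print("interaction:", interaction)
print("factorial design:", factorial_sample_size(n_baseline, 14),
      "to", factorial_sample_size(n_baseline, 33), "participants")
print("doubling:", 2 * n_baseline, "participants")
```

Under these assumptions, a study that would have used 40 participants needs 46 to 54 in the factorial design, compared with 80 if the sample were simply doubled.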

Numerous interactions, such as the interaction of the sex of the 
research participants, may also influence outcomes. In animal research, 
females and males are often studied separately in the laboratory. Yet in
the wild, the sexes coexist, and their interactions can influence research
results. Recent studies of longevity in the nematode Caenorhabditis
elegans found that the presence of males accelerated ageing in individuals
of the opposite sex (in this case, hermaphrodites). In other words, hermaphrodites
died at a younger age in the presence of males. Researchers
traced this ‘male-induced demise’ to pheromones released by males 
and found it could occur without mating and required only that the 
hermaphrodites be exposed to the medium in which males were once 
present”. Ignoring such interactions potentially leads to an incomplete
understanding of species viability in the wild. 

Other interactions focus on the sex of the researcher and potential 
impacts on research participants. In social science, it has long been 
understood that the simple presence of an observer can alter the 
response of the observed, whether in the field or in laboratory experi- 
ments”. In quantum mechanics, the act of observation can alter the 
phenomenon by collapsing the wave function. Similarly, in animal 
research, experimenter sex can influence research outcomes. A study 
exploring pain showed that rats and mice did not exhibit pain when a
male experimenter was present, as opposed to when a female experimenter
was present in the room or when in an empty room. Both female
and male mice displayed this ‘male observer’ effect, but female mice did
so to a greater extent. Researchers determined that the mice responded
to male-associated olfactory stimuli™*. The authors suggest that not 
controlling for experimenter sex throws into question many of the previ- 
ously published studies on pain research. 

Many other examples of these types of interactions—crucial to excel- 
lence and discovery in research—could be discussed. However, here 
we would like to include one further interaction of note, namely of 
researcher gender and the type of research conducted. Two studies 
provide compelling evidence that in biomedical, clinical and public 
health research, women in leading positions (first and last author) are 
more likely to analyse sex and gender in published research”. However, 
this dynamic has not yet been replicated in other research fields, such 
as computer science, engineering or the physical sciences. 


Opportunities for discovery 

Ignoring sex and gender analysis can lead to inaccuracies, research inef- 
ficiency and difficulties generalizing results. Integrating sex and gender 
analysis into research can open the door to discovery and innovation. 
Fig. 1 | Hazards of pooling data from both sexes. Pooling data across sexes not
only assumes that there is no difference between males and females, but also 
subsequently prevents researchers from testing for the dependency of an 
experimental response on the sex of a study participant. a, The theoretical 
examples reveal that pooling (green circles) masks important male (orange 
triangles) and female (blue squares) differences in baseline data, treatment 


A prevalent assumption is that sex is a binary trait determined genetically
before birth, and that it is fixed across lifespan”’”®. Commonly used
model organisms in biology, such as mice, Drosophila melanogaster 
and C. elegans, reinforce these perceptions. Sex, however, can be highly 
plastic, and studying interactions with the environment, for example, 
has led to new understandings of the mechanisms of sex determination 
within the context of global climate change. 

The sex ratio of a population influences its resilience to environmental
disturbances. The mechanism that determines sex is thus a vital consid- 
eration for predicting population viability”’*°. Enhancing the capacity 
of sex analysis for a growing number of species, across a wide range 
of settings, may increase our ability to accurately model the effects of 
climate change. 


Climate impacts in the ocean 
For species reliant on temperature for sex determination, rapid global 
warming poses a risk to sex ratios and demographic stability. Turtles 
are the most widely studied group in which sex is determined by tem- 
perature. The ability to differentiate between female and male juvenile 
greensea turtles using non-invasive endocrine markers has enabled the 
discovery that global warming negatively skews population sex ratios. 
Turtles originating from warmer northern Great Barrier Reef sites, for 
instance, exhibit a female sex ratio of 99%, whereas cooler southern 
sites maintain a 68% female juvenile ratio’. Similarly, in fish species with 
temperature-dependent sex determination, warming is projected to 
result in male-skewed populations (up to 3:1 male:female) by the end 
of the century”®. Such changes in sex balance can limit mate choice, 
reduce reproductive capacity and undermine population viability”. 
Warming is not occurring in isolation, but against a backdrop of 
anthropogenic disturbances across marine environments, which 
include habitat destruction, pollution and overfishing. Primary sex 
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response and sex x treatment interactions—any one of which leads to 
misinterpretation of the results. b, An example of experimental data in which 
pooling would have masked both the sex difference in the respiration rate of 
copepods, as well as the response of this variable to increased levels Of Poo," 
Theoretical examples were generated using hypothetical data; experimental 
data were taken froma previously published study”. 


differentiation has been shown to respond to a diverse range of these 
environmental factors in a growing number of species. Hypoxia, for 
example, has resulted ina higher ratio of males in zebrafish®. Similarly, 
ocean acidification results in 16% more female oysters over a single 
generational cycle*, and increased aquatic pH results in more female 
cichlids*. What is increasingly apparent is that alterations in sex ratio— 
in either direction—will result in populations that are less resilient to 
further disturbance and potentially lead to demographic collapse®**. 
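
The pooling hazard described in the Fig. 1 caption can be made concrete 
with a small numerical sketch. The measurements below are hypothetical 
(they are not the copepod data), and the function and variable names are 
illustrative only. The sketch shows how a pooled analysis can report no 
treatment effect while each sex responds strongly in opposite directions. 

```python
from statistics import mean

# Hypothetical respiration-rate measurements (arbitrary units); not real data.
data = {
    ("female", "control"): [10.0, 11.0, 9.0],
    ("female", "treatment"): [14.0, 15.0, 13.0],
    ("male", "control"): [14.0, 15.0, 13.0],
    ("male", "treatment"): [10.0, 11.0, 9.0],
}


def treatment_effect(values_by_group, sexes):
    """Mean of treated minus mean of control, pooling the listed sexes."""
    control = [v for s in sexes for v in values_by_group[(s, "control")]]
    treated = [v for s in sexes for v in values_by_group[(s, "treatment")]]
    return mean(treated) - mean(control)


pooled_effect = treatment_effect(data, ["female", "male"])
female_effect = treatment_effect(data, ["female"])
male_effect = treatment_effect(data, ["male"])
# Sex x treatment interaction: how much the effect differs between the sexes.
interaction = female_effect - male_effect

print(f"pooled effect:  {pooled_effect:+.1f}")
print(f"female effect:  {female_effect:+.1f}")
print(f"male effect:    {male_effect:+.1f}")
print(f"interaction:    {interaction:+.1f}")
```

In this constructed case the pooled effect is zero even though the 
sex-specific effects are large, which is why disaggregating by sex before 
averaging matters. 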

Social organization can also influence population sex ratios. Numerous 
nonhuman species develop elaborate social organizations, and sex 
determination can be socially mediated. Clownfish, for example, are 
protandrous hermaphrodites (they mature as male; some change to 
female) that live in a strict social hierarchy with a single dominant and 
highly fecund female at the top, who mates with a single large male in 
the social group; all remaining individuals remain immature juveniles. 
Removal of the alpha female results in the alpha male changing sex 
to female, with all subordinates moving up a rung in the social 
hierarchy. By contrast, many grouper species, a subfamily of long-lived 
and high-value reef species, are protogynous hermaphrodites (they 
mature as female; some change to male). Large dominant males control 
groups of females with strong sexual selection, resulting in these 
males achieving the greatest reproductive success. These sequentially 
hermaphroditic individuals consistently produce more offspring and 
enjoy greater reproductive success after they have changed sex. 
Thus, the timing and the direction of sex change are crucial 
species-specific factors that determine demographic resilience to 
disturbance in sex-changing organisms. 

A mechanistic understanding of these and other ecologically important 
sex-based responses enables more accurate modelling of the effects 
of environmental variability, climate change or anthropogenic 
disturbance (for example, overfishing) at a population level. Sex-specific 
effects of climate change stressors on sex determination mechanisms, 
particularly in commercially important species, have potentially 
important implications for humans with respect to aquatic food production, 
ecosystem services and biodiversity. Incorporating sex analysis into 
marine science, and the natural sciences more widely, enhances 
research excellence and opportunities for discovery. 


Targeted human therapeutics 

Sex analysis also reveals opportunities for human drug development. 
In the areas of pain and depression, the discovery of sex differences in 
molecular pathways has signalled new directions for targeted 
therapies. Pain research that uses experimental mouse models of chronic 
pain shows that male and female mice withdraw from painful stimuli 
in a similar fashion, except when the contribution of microglial cells is 
inhibited. Microglia are specialized immune cells located exclusively in 
the spinal cord and the brain. Inhibitors of microglia reduce pain sensing 
in male, but not female, mice, underscoring the potential importance 
of sex-dependent molecular pain pathways. Mouse models of 
depression also show sexually divergent networks in the brain with distinct 
patterns of stress-induced gene regulation in males and females. These 
findings have now been reproduced in human postmortem tissue and 
may provide insights into why males and females with major 
depressive disorder respond differently to treatment with antidepressants. 

Although sex-specific dosages are rare, a few already exist. Such is 
the case for the drug desmopressin, which activates vasopressin 
receptors in the kidney to regulate water homeostasis. Because the gene for 
the arginine vasopressin receptor is found on the X chromosome in 
a region that is likely to escape X-chromosome inactivation, women 
are more sensitive to the antidiuretic effects of vasopressin than men, 
who have only one X chromosome and therefore only one copy of the 
vasopressin receptor gene per cell. As a result, older women who take 
desmopressin are more likely to experience a reduced sodium 
concentration in the blood than men, which corresponds to a higher incidence 
of side effects in women. To avoid unnecessary harm, both the European 
Union and Canada have recommended lower dosages for older women 
taking desmopressin. 

Even cancer immunotherapy is benefitting from a deeper 
understanding of previously recognized genetic and hormone-mediated sex 
differences in immunity. Patients with melanoma or lung cancer who 
are treated with checkpoint inhibitors respond differently based on 
their sex, with a higher proportion of male than female patients 
achieving successful remission. Designed to outsmart the defence tactics 
of the cancer cells, checkpoint inhibitors stimulate natural killer cells 
to attack tumour cells. Natural killer cells are sensitive to oestrogen 
and testosterone, which may explain these observed sex differences. 
Understanding the underlying mechanisms will enable us to fine-tune 
future therapies. 

We expect to see an exponential rise in biomedical discoveries now 
that new computational biology and statistical genetics software 
facilitates the exploration of X-chromosome-related expression in complex 
diseases. Until recently, sex chromosomes were excluded from most 
genome-wide association studies because of the difficulty in 
distinguishing the active from the inactive X chromosome in females, and because 
of a mismatch in chromosomal size: the X chromosome has 1,669 
known genes and the smaller Y chromosome contains only 426. Including 
sex chromosomes in genome-wide association studies, as well as 
including and analysing adequate numbers of female and male cells, tissues, 
animals and humans in research, will broaden our understanding of why 
women and men are affected differently by certain diseases and how we 
can adapt life-saving therapies to their specific needs. 


Engineering for equality 
An often neglected but crucial component of engineering is to 
understand the broader social impacts of the technology being developed and 
to ensure that the technology enhances social equality by benefitting 
diverse populations. Human bias and stereotypes can be perpetuated, 
and even amplified, when researchers fail to consider how human 
preferences and assumptions may consciously or unconsciously be built into 
science or technology. Gender norms, ethnicity and other biological 
and social factors shape and are shaped by science and technology in a 
robust cultural feedback loop. This section discusses examples from 
product design, artificial intelligence (AI) and social robotics to illustrate 
how sex and gender analysis can enhance excellence in engineering. 


Designing safer products 

When products are designed based on the male norm, there is a risk 
that women and people of smaller stature will be harmed. Motor 
vehicle safety systems provide one such example. Because male drivers 
have historically been overrepresented in traffic data, seatbelts and 
airbags have been designed and evaluated with a focus on the typical 
male occupant with respect to anthropometric size, injury tolerance 
and mechanical response of the affected body region. When national 
automotive crash data from the United States for 1998 to 2008 were 
analysed by sex, they revealed that the odds of a belt-restrained 
female driver sustaining severe injuries were 47% higher than those of 
a belt-restrained male driver involved in a comparable crash, after 
controlling for weight and body mass. The subsequent introduction of a 
virtual female car crash dummy allowed mathematical simulations to 
account for the effect of acceleration on sex-specific biomechanics, 
highlighting the need to add a medium-sized female dummy model to 
regulatory safety testing. Beyond automotive safety systems, 
anthropometric characteristics, such as the carrying 
angle of the elbow or the shape and size of the human knee, can be used 
to guide sex-specific design for artificial joints, limb prostheses and 
occupational protective gear. 
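
To clarify what "odds 47% higher" means, the following minimal sketch 
computes an odds ratio from a two-by-two table. The counts are invented 
so that the ratio comes out at 1.47; they are not the crash data, and the 
sketch is unadjusted, whereas the study cited above controlled for 
covariates. It also shows that an odds ratio is not the same quantity as 
a ratio of probabilities. 

```python
# Hypothetical counts of belt-restrained drivers in comparable crashes.
# These numbers are invented to give a 47% difference in odds.
female = {"severe": 147, "not_severe": 1000}
male = {"severe": 100, "not_severe": 1000}


def odds(group):
    """Odds of severe injury: events divided by non-events."""
    return group["severe"] / group["not_severe"]


def risk(group):
    """Probability of severe injury: events divided by all drivers."""
    return group["severe"] / (group["severe"] + group["not_severe"])


odds_ratio = odds(female) / odds(male)
risk_ratio = risk(female) / risk(male)
percent_higher_odds = (odds_ratio - 1) * 100

print(f"odds ratio: {odds_ratio:.2f}")
print(f"odds are {percent_higher_odds:.0f}% higher for female drivers")
print(f"risk ratio: {risk_ratio:.2f}")
```

With these invented counts the risk ratio is somewhat smaller than the 
odds ratio, so a statement about odds should not be read as the same 
percentage increase in probability. 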


Reducing gender bias in AI 

Alarming examples of algorithmic bias are well documented. When 
translating gender-neutral language related to science, technology, 
engineering and mathematics (STEM) fields, Google Translate defaults 
to male pronouns. When photographs depict a man in the kitchen, 
automated image captioning algorithms systematically misidentify 
the individual as a woman. As AI becomes increasingly ubiquitous in 
everyday life, such bias, if uncorrected, can amplify social inequities. 
Understanding how gender operates within the context of the algorithm 
helps researchers to make conscious decisions about how their work 
functions in society. 

Since the Second World War, medical research has been subject to 
stringent review processes aimed at protecting participants from harm. 
AI, which has the potential to influence human life at scale, has yet to be 
so carefully examined. Numerous groups have articulated ‘principles’ 
for human-centred AI. These include, most importantly, the UN Human 
Rights Framework, which consists of internationally agreed upon human 
rights laws and standards, as well as the ‘Asilomar AI Principles’, ‘AI at 
Google: Our Principles’, ‘Partnership on AI’, and so on. What we lack 
are mechanisms for technologists to put these principles into practice. 
Here we discuss a few such rapidly developing mechanisms for AI. 

A first challenge in algorithmic bias is to identify when it is 
appropriate for an algorithm to use gender information. In some settings, such 
as the assignment of job ads, it might be desirable for the algorithm to 
explicitly ignore the gender of an individual as well as features such as 
weight, which may correlate with gender but are not directly related to 
job performance. In other applications, such as image/voice recognition, 
it might be desirable to leverage gender characteristics to achieve the 
best accuracy possible across all subpopulations. To date, there is no 
unified definition of algorithmic fairness, and the best approach is 
to understand the nuances of each application domain, make 
transparent how algorithmic decision-making is deployed and appreciate how 
bias can arise. 


Training data are a source of potential bias in algorithms. Certain 
subpopulations, such as darker-skinned women, are often 
underrepresented in the data used to train machine-learning algorithms, and 
efforts are underway to collect more data from such groups. To 
highlight the issue of underrepresented subpopulations in machine-learning 
data, researchers have designed ‘nutrition labels’ to capture metadata 
about how the dataset was collected and annotated. Useful metadata 
should summarize statistics on, for example, the sex, gender, ethnicity 
and geographical location of the participants in the dataset. In many 
machine-learning studies, the training labels are collected through 
crowdsourcing, and it is also useful to provide metadata about the 
demographics of crowd labellers. 
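
As a rough illustration of the kind of summary such a label might 
contain, the sketch below tabulates the share of records for each value 
of a few demographic fields. The records, field names and the function 
`dataset_label` are hypothetical and do not follow any published label 
format; missing values are reported explicitly rather than dropped. 

```python
from collections import Counter

# Hypothetical participant metadata; field names and values are illustrative.
participants = [
    {"sex": "female", "gender": "woman", "location": "Nairobi"},
    {"sex": "male", "gender": "man", "location": "London"},
    {"sex": "male", "gender": "man", "location": "London"},
    {"sex": "male", "gender": "man"},  # location was not recorded
]


def dataset_label(records, fields):
    """For each field, return the share of records taking each value."""
    label = {}
    for field in fields:
        counts = Counter(r.get(field, "not reported") for r in records)
        total = sum(counts.values())
        label[field] = {value: n / total for value, n in counts.items()}
    return label


label = dataset_label(participants, ["sex", "gender", "location"])
for field, shares in label.items():
    summary = ", ".join(f"{value}: {share:.0%}" for value, share in shares.items())
    print(f"{field}: {summary}")
```

The same function could be applied to a list of crowd labellers to 
document who produced the training labels. 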

Another approach to evaluate gender bias in algorithms is 
counterfactual analysis. Consider Google Search, in which men are five times 
more likely than women to be offered ads for high-paying executive 
jobs. The algorithm that decides which ad to show takes as input features 
about the individual making the query and outputs a set of ads predicted 
to be relevant. The counterfactual would test the algorithm in silico by 
changing the gender of each individual in the data and then studying 
how predictions change. If simply changing an individual from ‘woman’ 
to ‘man’ systematically leads to higher paying job ads, then the 
predictor is indeed biased. 
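
The counterfactual test can be sketched in a few lines. Both models 
below are toy stand-ins written for this illustration (they are not any 
real ad system), and the feature names are assumptions. The function 
`counterfactual_gap` changes only the gender field of each individual 
and reports the mean change in the model output. 

```python
def biased_ad_model(person):
    """Toy stand-in for an ad-selection model (not any real system).

    Returns the salary, in thousands, of the job ad it would show.
    """
    salary = 50 + 5 * person["years_experience"]
    if person["gender"] == "man":  # the bias the audit should detect
        salary += 20
    return salary


def gender_blind_model(person):
    """Toy model that ignores the gender field entirely."""
    return 50 + 5 * person["years_experience"]


def counterfactual_gap(model, people, original="woman", counterfactual="man"):
    """Mean change in model output when only the gender field is changed."""
    changes = []
    for person in people:
        if person["gender"] != original:
            continue
        altered = dict(person, gender=counterfactual)
        changes.append(model(altered) - model(person))
    return sum(changes) / len(changes)


people = [
    {"gender": "woman", "years_experience": 2},
    {"gender": "woman", "years_experience": 10},
    {"gender": "man", "years_experience": 5},
]

biased_gap = counterfactual_gap(biased_ad_model, people)
blind_gap = counterfactual_gap(gender_blind_model, people)
print(f"biased model gap: {biased_gap}")
print(f"gender-blind model gap: {blind_gap}")
```

A non-zero gap indicates that the output depends directly on the gender 
field. A zero gap does not rule out bias through correlated features, 
such as the weight example mentioned earlier in this section. 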

Work to debias word embeddings is another example of 
counterfactual analysis. Word embeddings associate each English word with a 
vector of features so that the geometry of the feature vectors captures 
semantic relations between the words. They are widely used in practice for 
applications such as sentiment analysis, language translation and 
analysis of electronic health records. It has previously been shown that 
gender stereotypes (for example, that men are more likely to be computer 
scientists) are manifested in the feature vectors of the corresponding 
words. Whether this association between man and computer is 
problematic depends on the application of the features. To test for gender 
effects, gender-neutral word features were created. For each 
downstream application, counterfactual analysis can then be performed by 
running the application twice, once using the original word features, 
and once using the gender-neutral features. If the outcome changes, 
the algorithm is sensitive to gender. In some applications, such as job 
searches, it might be preferable to use gender-neutral features. 
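
The "run it twice" procedure can be illustrated with made-up vectors. 
The sketch below assumes a simple projection-based neutralization (the 
component of a word vector along a man-minus-woman direction is 
removed); this is a simplification in the spirit of the debiasing work 
described above, not the exact published procedure, and the 
three-dimensional vectors and the toy downstream measure are invented. 

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def neutralize(vector, direction):
    """Remove the component of `vector` that lies along `direction`."""
    scale = dot(vector, direction) / dot(direction, direction)
    return [a - scale * d for a, d in zip(vector, direction)]


# Made-up 3-dimensional "embeddings"; real ones have hundreds of dimensions.
embedding = {
    "man": [1.0, 0.0, 1.0],
    "woman": [-1.0, 0.0, 1.0],
    "programmer": [0.5, 1.0, 1.0],
}
gender_direction = [m - w for m, w in zip(embedding["man"], embedding["woman"])]


def association_gap(word_vector):
    """Toy downstream application: how much closer a word is to 'man'
    than to 'woman', measured by dot product."""
    return dot(word_vector, embedding["man"]) - dot(word_vector, embedding["woman"])


original = embedding["programmer"]
neutral = neutralize(original, gender_direction)

# Run the application twice: original features, then gender-neutral ones.
gap_original = association_gap(original)
gap_neutral = association_gap(neutral)
print(f"gap with original features: {gap_original}")
print(f"gap with neutral features:  {gap_neutral}")
```

Because the outcome changes between the two runs, this toy application 
is sensitive to gender in the sense described above. 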

An alternative approach to quantify and reduce gender bias in 
algorithms is called multi-accuracy auditing. In standard machine 
learning, the objective is to maximize the overall accuracy for the entire 
population, as represented by the training data. In multi-accuracy, the 
goal is to ensure that the algorithm achieves good performance not 
only in the aggregate but also for specific subpopulations, for example, 
‘elderly Asian man’ or ‘Native American woman’. The multi-accuracy 
auditor takes a complex machine-learning algorithm and systematically 
identifies whether the current algorithm makes more mistakes for any 
subpopulation. In a recent paper, a neural network used for facial 
recognition was audited, and specific combinations of artificial neurons 
that responded to images of darker-skinned women were identified 
as responsible for the misclassifications. 

The auditor also suggests improvements when it identifies such 
biases. Although achieving equal accuracy across all demographic 
groups may not always be feasible, these auditing techniques improve 
the transparency of an AI system by quantifying how its performance 
varies across race, age, sex and intersections of these attributes. 
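
A much-simplified version of such an audit is sketched below. It only 
enumerates explicit intersections of the listed attributes and flags 
subgroups whose error rate exceeds the overall rate by a chosen margin; 
the multi-accuracy auditors described above search a far richer class of 
subpopulations and also post-process the model. The records, attribute 
names and margin are hypothetical. 

```python
from collections import defaultdict


def make_records(sex, skin, n_correct, n_wrong):
    """Build hypothetical classifier outcomes for one subgroup."""
    return ([{"sex": sex, "skin": skin, "correct": True}] * n_correct
            + [{"sex": sex, "skin": skin, "correct": False}] * n_wrong)


def audit_subgroups(records, attributes, margin=0.1):
    """Return the overall error rate and the subgroups that exceed it by `margin`."""
    overall_error = sum(not r["correct"] for r in records) / len(records)
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[a] for a in attributes)].append(r)
    flagged = {}
    for key, members in groups.items():
        error = sum(not m["correct"] for m in members) / len(members)
        if error > overall_error + margin:
            flagged[key] = error
    return overall_error, flagged


# Invented outcomes: errors are concentrated in one intersectional subgroup.
records = (make_records("woman", "darker", 2, 2)
           + make_records("woman", "lighter", 4, 0)
           + make_records("man", "darker", 4, 0)
           + make_records("man", "lighter", 4, 0))

overall, by_intersection = audit_subgroups(records, ["sex", "skin"])
_, by_sex_only = audit_subgroups(records, ["sex"])
print(f"overall error: {overall}")
print(f"flagged intersections: {by_intersection}")
print(f"flagged by sex alone: {by_sex_only}")
```

In this invented example, auditing by sex alone shows a 25% error rate 
for women, whereas the intersectional audit reveals a 50% error rate for 
darker-skinned women. 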

These are only a few of the specific techniques computer scientists 
are developing to promote gender fairness in algorithms. Some, such 
as data checks, are relevant across all disciplines that amass and analyse 
big data. Others are specific to machine learning, which is now widely 
deployed across broad swathes of intellectual endeavours from the 
humanities to the social sciences, biomedicine and judicial systems. In 
all instances, it is important to be completely transparent about where and 
for what purpose AI systems are used, and to characterize the behaviour of 
the system with respect to sex and gender. 

Fig. 2 | Sex analysis and reporting in science and engineering. This decision 
tree represents a cognitive process for analysing sex. A ‘no’ indicates no further 
analysis is necessary. A ‘yes’ suggests the next step that should be considered. 


Combatting stereotypes 

Analysing gender in software systems is one issue; configuring gender 
in hardware, such as social robots, is another, and the focus of this 
section. Until recently, robots were largely confined to factories. Most 
people never see or interact with these robots; they do not look, sound 
or behave like humans. But engineers are increasingly designing robots 
to assist humans as service robots in hospitals, elder care facilities, 
classrooms, homes, airports and hotels. The field of social human-robot 
interaction examines, among other things, when and how ‘gendering’ 
robots, virtual agents or chatbots might enhance usability while, at the 
same time, considering when and how to avoid oversimplifications that 
may reinforce potentially harmful gender stereotypes. 

Machines are, in principle, genderless. Gender, however, is a core 
social category in human impression formation that is readily applied 
to nonhuman entities. Thus, users may consciously or unconsciously 
gender machines as a function of anthropomorphizing them, even when 
designers intend to create gender-neutral devices. 

Anthropomorphizing technologies may help users to engage more 
effectively with them, which raises the question as to whether there 
are benefits to tapping into the power of social stereotypes by 
building gender into virtual agents, chatbots or social robots. For 
example, if roboticists deploy female carebots in female-typical roles, 
such as nursing, would users better comply with the robot’s requests to 
take daily medication or to exercise? Does gendering robots or virtual 
agents facilitate interaction or boost objective outcomes such as 
performance? Will personalizing robots or chatbots by gender increase 
consumer acceptance and, even, sales figures? Systematic empirical 
research is needed to address these open research issues. 

What features lead humans to gender a robot? So far, experimental 
research designed to analyse robot gender has manipulated gender in a 
number of ways, including (1) by choosing a male or female name to label 
the robot; (2) by colour-coding the robot; (3) by manipulating 
visual indicators of gender (for example, face, hairstyle or lip colour); 
(4) by adding a male or female voice, or a low or high pitch to simulate 
this, respectively; (5) by designing a gendered personality; 
and (6) by deploying robots in gender-stereotypical domains, such as a 
male-voiced robot for security and a female-voiced robot in a healthcare 
role. Other aspects, such as movements or gestures, that may 
potentially gender a robot still require empirical research. 

But there are dangers here. As soon as designers or users assign a 
gender to a machine, stereotypes follow. Designers of robots and AI do 
not simply create products that reflect our world, they also (perhaps 
unintentionally) reinforce and validate certain gender norms that are 
considered to be appropriate for men, women or gender 
nonconforming individuals. 

Eliciting gendered perceptions of technologies implies actively 
designing human gender biases, including binary constructions of 
gender as male or female, into machines. From a social 
psychological viewpoint, this can contribute to stereotypical gender norms in 
society. Even though this might not seem relevant from an 
engineering point of view, social psychological research would suggest that a 
robot with a female appearance, for example, may perpetuate ideas of 
women as nurturing and communal, traits stereotypically associated 
with women. Thus, a female robot may be deemed socially warm and 
particularly suitable for stereotypically female tasks, such as elderly 
care, or it might be openly sexualized and objectified, as revealed in 
abusive commentary on video clips of female robots in recent 
qualitative research. Similarly, virtual personal assistants with female names, 
voices and stereotypical, submissive behaviours, such as Siri or Alexa, 
represent heteronormative ideas about females and thereby indirectly 
contribute to discrimination against women in society. An interesting 
development in this regard is the genderless voice, Q, which has recently 
been developed in Denmark to overcome such bias. 

There are many questions regarding these features. How, for example, 
do user attributes, such as age or gender, interact with different robot 
design features? How do robots enhance or harm real-world attitudes 
and behaviours related to social equality? How does robot gender elicit 
different responses across cultures? More experimental, laboratory 
and longitudinal field research is needed to test whether, and how, a 
machine’s gendered, gender-diverse or gender-neutral appearance 
or behaviour influences human affect, cognition and behaviour. It is 
likely that even social robots designed to be genderless or gender neu- 
tral elicit gender attributions owing to the relatively automatic nature 
of anthropomorphizing humanoid robots. It is also likely that when 
potential end users are offered the option to select a digital assistant’s 
gender, their choice will be driven by their own gender identity and 
gender-related attitudes and stereotypes. Addressing these research 
questions and issues remains important to shed light on the psychologi- 
cal, social and ethical implications of implicit or explicit design choices 
for novel technologies. 

Developing technologies that enhance, or at least do not harm, social 
equality will require novel configurations of researchers. Much attention 
has been paid to the need for interdisciplinary research that brings 
together humanists, legal experts, technologists and social scientists, 
especially in the fields of human-centred AI. Over the course of the 
nineteenth and twentieth centuries, however, the historical development 
of universities artificially separated human knowledge into 
disciplines that may not support current research needs. Research 
institutions now need to develop robust mechanisms to bring together social 
analysis and engineering in a way that rigorously addresses the emerging 
needs of society. 


Pathways to improving study design 

To reach the full potential of sex and gender analysis for discovery and 
innovation, it is important to integrate sex and gender analysis, where 
relevant, into the design of research from the very beginning. Much 
of science and engineering research is path-dependent: once research 
has been designed, it becomes difficult to change. It is also important to 
understand that sex and gender are categories of analysis or variables (or 
controls) that need to be incorporated into the research process, but 
need not be the main focus of the research. Nor will sex and gender 
analysis be relevant to all types of research. As the decision trees for analysing 
sex (Fig. 2) and gender (Fig. 3) indicate, in cases in which researchers have 
considered sex and/or gender but judge that this analysis is not relevant 
for a specific hypothesis, they may rule it out. Moreover, if researchers 
expect sex or gender to be important but find no significant differences, 
this may represent a result worthy of publication. Reporting cases in 
which sex or gender sameness, overlap or no difference is found may 
represent an important finding. 

Fig. 3 | Gender analysis and reporting in science and engineering. This 
decision tree represents a cognitive process for analysing gender. A ‘no’ 
indicates no further analysis is necessary. A ‘yes’ suggests the next step that 
should be considered. 

In this Perspective, we highlight the need and promise for designing 
sex and gender analysis into research through specific case studies and 
examples. From these, we extracted key considerations for analysing 
sex (Fig. 2) and gender (Fig. 3). These are generic recommendations that 
work across disciplines. However, more related studies are needed in 
the next five years. First, through interdisciplinary work, researchers 
need to sharpen and standardize generic approaches to sex and gender 
analysis that generalize across fields. Second, through discipline-specific 
work, researchers need to craft state-of-the-art analytics for study design 
and data analysis in their own subfields. The European Commission is 
currently funding an expert group that seeks to tailor sex and gender 
methods of analysis to field-specific protocols. 


Future challenges 

We do not yet have results for sex and gender analysis in the physical 
sciences, such as basic chemistry, pure physics, geology or astronomy. 
Much work has analysed gender gaps in participation and gender bias 
in the culture of these fields, but attention has yet to turn to how the 
research itself may respond to gender analysis. As research in the 
physical sciences becomes more applied, sex and gender analysis becomes 
more relevant. In the chemistry of aerosols, for example, sex 
differences govern rates of inhalation and gender differences influence rates 
of exposure. 

Fig. 4 | Three pillars of science and engineering infrastructure. To reap the 
benefits of sex and gender analysis, the pillars of science infrastructure must 
develop and implement coordinated policies. 

Several methodological challenges remain for the field of sex and 
gender analysis itself. Although advances have been made in methods 
for analysing sex, we lack non-invasive methods of sex determination 
in numerous non-model organisms in which sexual morphological 
dimorphism is not easily detected. Technological advances through 
the development of genetic, metabolomic and endocrine 
markers of organism sex are needed for non-model species at all stages of 
development, an endeavour that will be aided by the innovation and 
increased affordability of omics approaches. Attention will also need 
to be paid to the translation of evidence from animal species to humans 
because, in many cases, molecular sex differences observed in humans may 
not be mirrored in nonhuman mammals. 

Although sex as a biological variable in science and engineering is 
increasingly well understood, the same cannot be said for gender as 
a cultural variable. Gender is complex and multidimensional (Facebook 
introduced 58 gender categories in 2014) and applications in technical 
fields often require collaboration with social scientists to understand the 
relevant aspects of gender for specific projects. Even in health research, 
we lack systematic measures for assessing how gender relates to health 
because gender does not reduce easily to variables that can be 
manipulated statistically. Two recent studies have attempted to remedy this. 
The first used a binary gender index (masculinity versus femininity) 
constructed from seven variables and found that the incidence of 
recurrence and death 12 months after diagnosis of acute coronary syndrome 
in young adults was associated with gender and specifically not with 
biological sex. A second study under development at Stanford 
University seeks to capture the multidimensionality of gender better by 
identifying theoretically robust gender-related variables relevant for 
health research. This study is based on US data, and new variables 
tailored to specific cultural settings need to be identified. Developing 
measures of gender is clearly an area for which more research is needed. 

Other methodological challenges include going beyond the binary 
(female and male, women and men) in both sex and gender analysis. Take, 
for instance, the Gender API algorithm that allows social scientists to 
understand gender differences in research patterns. The algorithm 
identifies only binaries: female/male; woman/man. In the United States, 0.6% 
of the population, nearly 2 million people, identify as transgender, 
and more than 15 countries offer a third sex category on legal documents, 
birth certificates and passports. Research needs to keep pace with social 
change. Similarly, consider the lack of research that addresses how 
hermaphroditic animals respond to environmental change. In simultaneous 
hermaphrodites, in which reproductively mature individuals have both 
male and female gametes, there is a need to consider the role of male 
or female tissues in determining the response of the whole organism. 
By contrast, in sequential hermaphrodites that change sex, there is a 
need to consider whether an organism responds as a female or a male 
to environmental stress during the sex change process, given that this 
process is dynamic, with behavioural, endocrine and genetic systems 
switching sex on markedly different timescales. 

Additional challenges include accounting for other social variables, 
such as age, race and geographical location, and how these intersect 
with sex and/or gender. Sex or gender cannot be isolated from other 
characteristics, and we need model systems and intersectional methods 
to understand these interrelationships. An intersectional approach 
in human research underscores the importance of unmasking and 
rectifying overlapping and interdependent systems of discrimination 
that are often built into knowledge, programs and policies. Benefits 
for global health, for example, will only be achieved when unbiased 
decision-making about resources takes into account the lived 
experiences of women and men with multiple identity characteristics who 
simultaneously suffer from race, class, education, economic and cultural 
power imbalance in accessing food and water, digital technology and 
healthcare services. 


Science policy 

Policy is one driver of discovery and innovation that can enable sex and 
gender analysis in science and technology. To push forward rigorous 
sex and gender analysis, interlocking policies need to be implemented 
by three pillars of academic research: funding agencies, peer-reviewed 
journals and universities (Fig. 4). 

Government-led funding agencies have taken the lead by asking 
applicants to explain how sex and gender analysis is relevant to their proposed 
research, or to explain that it is not (for a list of agencies and policies, 
see Supplementary Information section 1). The Canadian Institutes of 
Health Research showed robust uptake after mandating in 2010 that 
applicants declare whether sex and/or gender were accounted for in 
proposals and justify any exclusion. Their evaluation revealed that from 
2010-2011 the proportion of funded proposals incorporating sex and/ 
or gender analysis nearly doubled. 

Peer-reviewed journals, the second pillar, have developed editorial 
policies advocating for sex or gender analysis to ensure excellence in 
papers selected for publication (for a list of journals and policies, see 
Supplementary Information section 2). Uptake has been swift in health 
and medicine. The Lancet, for example, adopted such guidelines in 2016, 
followed quickly by the International Committee of Medical Journal 
Editors. The Structured, Transparent, Accessible Reporting (STAR) 
methods of Cell Press have required transparent reporting of the sex 
distribution of donor cells, also since 2016. Importantly, the widely adopted 
Sex and Gender Equity in Research (SAGER) guidelines recommend that 
data be disaggregated by both sex and gender. Although biomedical 
journals have moved rapidly, we are not aware of any engineering or 
computer science conferences or journals with such guidelines. 

Pillars one and two need the support of a third pillar: universities. Both
funding agencies and journals may have policies in place, but researchers
and evaluators by and large lack expertise in sex and gender analysis.
The European Commission, which has had policies in place since 2014,
found that fewer funded research proposals than expected incorporated
sex and gender analysis, and has attributed this low proportion
to an ‘absence of training on gender issues’. Similarly, an analysis of
animal research in the neurosciences showed that in 2014 only about
14% of peer-reviewed articles considered sex as a biological variable.

Universities need to step up and incorporate sex and gender analysis
as a conceptual tool into science and engineering curricula. Numerous
universities offer gender analysis in the humanities and social sciences,
but not in core natural science and engineering courses. Efforts have
been made in medicine: the Charité in Berlin, Germany, for instance,
has successfully integrated sex and gender analysis throughout all six
years of medical training, from early basic science to later clinical
modules. However, this is a rare example, and universities must do more
to prepare the scientific workforce for the future.

Several initiatives have endeavoured to fill this gap. Gendered
Innovations, a global, collaborative project initiated from Stanford
University in 2009 and supported by the European Commission and the US
National Science Foundation, has developed practical methods of sex and
gender analysis for natural scientists and engineers, and provides case
studies as concrete illustrations of how sex and gender analysis lead to
discovery and innovation (https://genderedinnovations.stanford.edu/).
The WHO (World Health Organization) has developed a gender-responsive
assessment tool. The Organization for the Study of Sex Differences
(https://www.ossdweb.org/) has advanced sex and gender analysis methods
for the life and health sciences. The Canadian Institutes of Health
Research have developed online training modules for integrating sex and
gender analysis into biomedical research. These initiatives should now
be mainstreamed into university education.

Much work remains to be done to systematically integrate sex and
gender analysis into relevant domains of science and technology, from
strategic considerations for establishing research priorities to
guidelines for establishing best practices in formulating research
questions, designing methodologies and interpreting data. To make real
progress in the next decade, researchers, funding agencies, peer-reviewed
journals and universities need to coordinate efforts to develop and
standardize methods of sex and gender analysis.

But eyes have been opened, and by integrating sex and gender analysis
into their work, researchers can enhance excellence and social
responsibility in science and engineering.


Elastic electron-proton scattering (e-p) and the spectroscopy of hydrogen
atoms are the two methods traditionally used to determine the proton
charge radius, r_p. In 2010, a new method using muonic hydrogen atoms
found a substantial discrepancy compared with previous results, which
became known as the ‘proton radius puzzle’. Despite experimental and
theoretical efforts, the puzzle remains unresolved. In fact, there is a
discrepancy between the two most recent spectroscopic measurements
conducted on ordinary hydrogen. Here we report on the proton charge radius
experiment at Jefferson Laboratory (PRad), a high-precision e-p experiment
that was established after the discrepancy was identified. We used a
magnetic-spectrometer-free method along with a windowless hydrogen gas
target, which overcame several limitations of previous e-p experiments and
enabled measurements at very small forward-scattering angles. Our result,
r_p = 0.831 ± 0.007_stat ± 0.012_syst femtometres, is smaller than the
most recent high-precision e-p measurement and 2.7 standard deviations
smaller than the average of all e-p experimental results. The smaller r_p
we have now measured supports the value found by two previous muonic
hydrogen experiments. In addition, our finding agrees with the revised
value (announced in 2019) for the Rydberg constant, one of the most
accurately evaluated fundamental constants in physics.


The proton is the dominant component of visible matter in the Universe.
Consequently, determining the proton’s basic properties, such as its
root-mean-square charge radius, r_p, is of interest in its own right.
Accurate knowledge of r_p is also important for the precise determination
of other fundamental constants, such as the Rydberg constant (R_∞).
The value of r_p is also required for precise calculations of the energy
levels and transition energies of the hydrogen atom, for example the
Lamb shift. In muonic hydrogen (μH atoms), in which the electron in
the H atom is replaced by a ‘heavier electron’ (a muon), the extended
proton charge distribution changes the Lamb shift by as much as 2%.
The first-principles calculation of r_p from the accepted theory of the
strong interaction (quantum chromodynamics, QCD) is notoriously
challenging and currently cannot reach the accuracy demanded by
experiments, but lattice QCD calculations are on the cusp of becoming
precise enough to be tested experimentally. Therefore, the precise
measurement of r_p is not only critical for addressing the proton radius
puzzle but also important for determining certain fundamental
constants of physics and testing lattice QCD.

Prior to 2010 the two methods used to measure r_p were ep → ep elastic
scattering measurements, in which the slope of the extracted proton
(p) electric (E) form factor, G_E^p, as the four-momentum transfer squared
(Q²) approaches zero, is proportional to r_p²; and Lamb shift
(spectroscopy) measurements of ordinary H atoms, which, along with
state-of-the-art calculations, can be used to determine r_p. Although the
e-p results can be somewhat less precise than the spectroscopy results,
until 2010 the values of r_p obtained from these two methods mostly agreed
with each other. Since that year, two new results based on Lamb shift
measurements in μH were reported. The Lamb shift in μH is several million
times more sensitive to r_p because the muon in a μH atom is about 200
times closer to the proton than is the electron in a H atom. To the
surprise of both the nuclear and atomic physics communities, the two μH
results, displaying unprecedented precision with an estimated uncertainty
of <0.1%, combined to be eight standard deviations smaller than the
average value obtained from all previous experiments. This became known
as the proton radius puzzle, unleashing intensive experimental and
theoretical efforts aimed at resolving the disagreement.

Fig. 1 | The PRad experimental setup. A schematic layout of the PRad
experimental setup in Hall B at Jefferson Laboratory, with the electron beam
incident from the left. The key beam-line elements are shown along with the
windowless hydrogen gas target, the two-segment vacuum chamber and the two
detector systems (see the Methods for a brief overview and the Supplementary
Information for a description of the target and individual detectors).

The discrepancy between the values of r_p as measured in H and μH
atoms remains unresolved. Moreover, the two most recent H spectroscopy
measurements disagree with each other, which has added a new
dimension to and renewed the urgency of this problem. A fundamental
difference between the e-p and μ-p interactions could be the origin of
the discrepancy; however, there are abundant experimental constraints
on any such ‘new physics’, although models that resolve the puzzle by
invoking new force carriers have been proposed. More mundane solutions
continue to be explored: for example, it has been rigorously shown
that the definition of r_p used in all three major experimental approaches
was consistent. The effect of two-photon exchange on μH spectroscopy,
and form-factor nonlinearities in e-p scattering, have also
been examined. None of these studies has adequately explained the puzzle,
reinforcing the need for additional high-precision measurements
of r_p that use new experimental techniques and different systematics.

The PRad collaboration at Jefferson Laboratory has developed and
performed an e-p experiment as an independent measurement of r_p
to address the puzzle. The PRad experiment, in contrast with previous
e-p experiments, was designed to use a magnetic-spectrometer-free,
calorimeter-based method. The design of the PRad experiment implemented
three major improvements over previous e-p experiments.
First, the large angular acceptance (0.7°-7.0°) of the hybrid calorimeter
(HyCal) enabled large Q² coverage, spanning two orders of magnitude
(2.1 × 10⁻⁴ GeV²/c² to 6 × 10⁻² GeV²/c², where c is the speed of light in
a vacuum) in the low Q² range. The fixed location of HyCal eliminated
the many normalization parameters that plague magnetic-spectrometer-based
experiments in which the spectrometer must be physically
moved to many different angles to cover the desired range of Q². In
addition, the PRad experiment reached extreme forward-scattering
angles of down to 0.7°, achieving a Q² value of 2.1 × 10⁻⁴ GeV²/c²; this
is, to our knowledge, the lowest Q² obtained from e-p experiments and
is an order of magnitude lower than that previously achieved. Reaching
a lower range for Q² is critical because r_p is determined from the slope
of the electric form factor at Q² = 0. Second, the extracted e-p
cross-sections were normalized to the well-known quantum electrodynamics
process e⁻e⁻ → e⁻e⁻ (Møller scattering from atomic electrons, e-e),
which was measured simultaneously alongside e-p scattering, using
the same detector acceptance. This led to a substantial reduction in the
systematic uncertainties of measuring the e-p cross-sections. Third, the
background generated from the target windows, one of the dominant
sources of systematic uncertainty in all previous e-p experiments, was
highly suppressed in the PRad experiment.
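
As a rough cross-check of the Q² reach quoted for the 0.7°-7.0° acceptance, the standard elastic-scattering kinematics can be evaluated directly. The following is a minimal Python sketch rather than the collaboration's analysis code: it assumes an ultra-relativistic electron (electron mass neglected), a proton at rest and natural units (c = 1), and the function names are illustrative.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton mass in GeV/c^2


def scattered_energy(beam_energy_gev, theta_deg, target_mass_gev=PROTON_MASS_GEV):
    """Energy E' of an electron scattered elastically through theta off a target at rest."""
    s2 = math.sin(math.radians(theta_deg) / 2.0) ** 2
    return beam_energy_gev / (1.0 + 2.0 * beam_energy_gev * s2 / target_mass_gev)


def q2_elastic(beam_energy_gev, theta_deg, target_mass_gev=PROTON_MASS_GEV):
    """Four-momentum transfer squared, Q^2 = 4 E E' sin^2(theta/2), in GeV^2/c^2."""
    s2 = math.sin(math.radians(theta_deg) / 2.0) ** 2
    e_prime = scattered_energy(beam_energy_gev, theta_deg, target_mass_gev)
    return 4.0 * beam_energy_gev * e_prime * s2


if __name__ == "__main__":
    # Acceptance edges combined with the two beam energies used in the experiment.
    for energy, angle in [(1.1, 0.7), (2.2, 7.0)]:
        print(f"E = {energy} GeV, theta = {angle} deg: Q^2 = {q2_elastic(energy, angle):.3e} GeV^2")
```

The two printed values are of the order of 10⁻⁴ and several times 10⁻² GeV²/c², in line with the quoted range; exact agreement is not expected because the quoted limits refer to the Q² bins actually analysed rather than to the acceptance edges.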


The PRad experimental apparatus consisted of four main elements
(Fig. 1). (1) A 4-cm-long windowless cryo-cooled hydrogen gas flow target
with an areal density of 2 × 10¹⁸ atoms per cm², which eliminated the beam
background from the target windows. (2) The high-resolution,
large-acceptance hybrid electromagnetic calorimeter, HyCal. The complete
azimuthal coverage of HyCal for the forward-scattering angles enabled
simultaneous detection of the pair of electrons from e-e scattering.
(3) A plane made of two high-resolution X-Y gas electron multiplier
(GEM) coordinate detectors located in front of HyCal. (4) A two-section
vacuum chamber spanning the 5.5-m distance from the target to the
detectors.

The PRad experiment was performed in Hall B at Jefferson Laboratory
in May-June of 2016, using 1.1-GeV and 2.2-GeV electron beams. The
standard Hall B beam line, designed for low beam currents (0.1-50 nA),
was used in this experiment. The incident electrons that scattered off
the target protons and the Møller electron pairs were detected in the
GEM detector and HyCal. The energy and position of the detected
electron(s) were measured by HyCal, and the transverse (X-Y) position
was measured by the GEM detector, which was used to assign the
Q² for each detected event. The GEM detector, which has a position
resolution of 72 μm, improved the measurement accuracy of Q² compared
to detection by HyCal alone. Furthermore, the GEM detector
suppressed the contamination from photons generated in the target
and other beam-line materials; HyCal is equally sensitive to electrons
and photons, whereas the GEM detector is mostly insensitive to neutral
particles. The GEM detector also helped to suppress position-dependent
irregularities in the response of HyCal. A plot of the reconstructed
energy versus the reconstructed angle for e-p and e-e events is shown
in Fig. 2 for the 2.2-GeV beam energy.

Fig. 2 | Event reconstruction. The reconstructed energy versus angle for e-p
and e-e events for an electron beam energy of 2.2 GeV. The red and black lines
indicate the event selections for e-p and e-e, respectively. The angles <3.5° are
covered by the crystal PbWO₄ modules of HyCal and the larger angles by the Pb
glass modules. The colour bar shows the number of events.

Fig. 3 | The measured cross-section and form factor. a, The reduced
cross-section,
$\sigma_{\mathrm{reduced}} = \left(\frac{d\sigma}{d\Omega}\right)_{e\text{-}p} \Big/ \left[\left(\frac{d\sigma}{d\Omega}\right)_{\text{point-like}} \frac{4M_p^2 E'/E}{4M_p^2 + Q^2}\right]$
(where E is the electron beam energy, E′ is the energy of the scattered
electron, M_p is the mass of the proton and Ω is the solid angle subtended
by the scattered electron detector), for the PRad e-p data. Dividing out the
kinematic factor inside the square brackets, σ_reduced is a linear
combination of the electromagnetic form factors squared. The bands at the
bottom of the plot are the size of the systematic uncertainties, for 1.1 GeV
(red) and 2.2 GeV (blue). The error bars show statistical uncertainties.
b, G_E^p as a function of Q². The data points are normalized by the
parameter n in equation (1) for the 1.1-GeV and 2.2-GeV data, labelled as
n₁ and n₂, respectively. The error bars show statistical uncertainties. The
bands are the systematic uncertainties as in a. The solid black curve shows
G_E^p(Q²) as a fit to the function given by equation (1). Also shown is the
fit from a previous e-p experiment, giving r_p = 0.883(8) fm (green dashed
line), and another previous calculation giving r_p = 0.844(7) fm (purple
dot-dashed line).

The background was measured periodically with an empty target cell.
To mimic the residual gas in the beam line, H₂ gas at very low pressure
was allowed in the target chamber during the empty target runs. The
charge-normalized e-p and Møller scattering yields from the empty
target cell were used to subtract the background contributions. The
beam current was measured with the Hall B Faraday cup with an
uncertainty of <0.1%. Further details on the background subtraction can be
found in the Supplementary Information.
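
The charge-normalized subtraction described above amounts to simple arithmetic. The following Python sketch uses illustrative function names and invented counts and charges (they are not values from the experiment), and it ignores dead-time and efficiency corrections that a real analysis would also apply.

```python
def charge_normalized_yield(counts, beam_charge):
    """Event counts per unit of accumulated beam charge (for example, from a Faraday cup)."""
    if beam_charge <= 0:
        raise ValueError("beam charge must be positive")
    return counts / beam_charge


def background_subtracted_yield(full_counts, full_charge, empty_counts, empty_charge):
    """Full-target yield minus empty-target yield, each normalized to its own beam charge."""
    full = charge_normalized_yield(full_counts, full_charge)
    empty = charge_normalized_yield(empty_counts, empty_charge)
    return full - empty


if __name__ == "__main__":
    # Invented example: 1,000,000 events for 50 units of charge with gas in the cell,
    # 20,000 events for 10 units of charge with the cell empty.
    print(background_subtracted_yield(1_000_000, 50.0, 20_000, 10.0))
```

Normalizing each run to its own accumulated charge is what allows full-target and empty-target runs of different lengths to be compared.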


A comprehensive Monte Carlo simulation of the PRad setup was
developed using the Geant4 toolkit. The simulation consists of two
separate event generators built for the e-p and e-e processes.
Inelastic e-p scattering background events were also included in the
simulation using a fit to the e-p inelastic world data. The simulation
included signal digitization and photon propagation, which were critical
for the precise reconstruction of the position and energy of each event
in the HyCal. The details are described in the Supplementary Information.

The e-p cross-sections were obtained by comparing the simulated
and measured e-p yield relative to the simulated and measured e-e
yield (see Supplementary Information for details). The extracted
reduced cross-section is shown in Fig. 3a. The e-p elastic cross-section
is related to G_E^p and the proton magnetic form factor, G_M^p, by the
Rosenbluth formula. In the very low Q² region covered by the PRad
experiment, the cross-section is dominated by the contribution from G_E^p.
Thus, the uncertainty introduced from G_M^p is negligible. In fact, when
using a wide variety of parametrizations for G_M^p, the extracted G_E^p
varies by about 0.2% at Q² = 0.06 GeV²/c², the largest Q² accessed by the
PRad experiment, and by <0.01% in the Q² < 0.01 GeV²/c² region. The
largest variation in r_p arising from the choice of G_M^p parametrization
is 0.001 fm. G_E^p(Q²) as extracted from our data is shown in Fig. 3b,
using the Kelly parametrization for G_M^p.
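
For reference, the Rosenbluth formula mentioned above is commonly written as follows. This is the standard textbook form in natural units (c = 1), not an expression reproduced from the source; here the Mott cross-section is the one for a structureless, infinitely heavy target, so the recoil factor E′/E appears explicitly.

```latex
\left(\frac{d\sigma}{d\Omega}\right)_{e\text{-}p}
  = \left(\frac{d\sigma}{d\Omega}\right)_{\mathrm{Mott}} \frac{E'}{E}\,
    \frac{1}{1+\tau}\left[\left(G_E^p\right)^2 + \frac{\tau}{\varepsilon}\left(G_M^p\right)^2\right],
\qquad
\tau = \frac{Q^2}{4M_p^2},
\qquad
\varepsilon = \left[1 + 2(1+\tau)\tan^2\frac{\theta}{2}\right]^{-1}
```

The magnetic term is weighted by τ, which is about 0.017 at Q² = 0.06 GeV²/c² and far smaller at the lowest Q², which is why the choice of G_M^p parametrization has only a small effect on the extracted G_E^p.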

The slope of G_E^p(Q²) as Q² → 0 is proportional to r_p². A common practice
is to fit G_E^p(Q²) to a functional form and to obtain r_p by extrapolating
to Q² = 0. However, each functional form truncates the higher-order
moments of G_E^p(Q²) differently and introduces a model dependence
that can bias the determination of r_p. It is critical to choose a robust
functional form that is most likely to yield an unbiased estimation of
r_p given the uncertainties in the data, and to test the chosen functional
form over a broad range of parametrizations of G_E^p(Q²). To
simultaneously minimize possible bias in the determination of the radius
and the total uncertainty, various functional forms were examined for their
robustness in reproducing an input r_p used to generate a mock dataset
with the same statistical uncertainty as the PRad data. The robustness,
quantified as the root-mean-square error (RMSE), is defined as

$$\mathrm{RMSE} = \sqrt{(\delta R)^2 + \sigma^2},$$

where δR is the bias, or the difference between the input and extracted
radius, and σ is the statistical variation of the fit to the mock data.
Previous studies show (see Supplementary Information) that consistent
results with the smallest uncertainties can be achieved using a
multi-parameter rational function, which we refer to as Rational(1, 1):

$$f(Q^2) = n\,G_E^p(Q^2) = n\,\frac{1 + p_1 Q^2}{1 + p_2 Q^2} \qquad (1)$$

where n is the floating normalization parameter, p₁ and p₂ are fit
parameters and the proton charge radius is given by
$r_p = \sqrt{6(p_2 - p_1)}$. The G_E^p(Q²), extracted from the 1.1-GeV and
2.2-GeV data, was fitted simultaneously using the Rational(1, 1) function.
Independent normalization parameters n₁ and n₂ were assigned for the
1.1-GeV and 2.2-GeV data, respectively, to allow for differences in
normalization uncertainties, but the Q² dependence was identical. The
parameters obtained from fits to the Rational(1, 1) function are
n₁ = 1.0002 ± 0.0002_stat ± 0.0020_syst; n₂ = 0.9983 ± 0.0002_stat
± 0.0013_syst; and r_p = 0.831 ± 0.007_stat ± 0.012_syst fm. The
Rational(1, 1) function describes the data very well, with a reduced χ² of
1.3 when considering only the statistical uncertainty. The values of r_p for
a variety of functional forms fitted to the PRad data are shown in
Supplementary Fig. 15.
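
To make the relation between the fit parameters and the radius concrete, the Rational(1, 1) form of equation (1), the radius formula and the RMSE definition can be sketched in a few lines of Python. This is an illustration only: the function names are ours, the parameter values in the example are arbitrary numbers chosen to reproduce a radius of 0.831 fm (they are not the fitted values, which the text does not quote), and the factor ħc is the standard conversion that is needed when Q² is expressed in GeV²/c² and the radius in fm.

```python
import math

HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV fm; converts a length in GeV^-1 to fm


def rational_1_1(q2, n, p1, p2):
    """Equation (1): f(Q^2) = n (1 + p1 Q^2) / (1 + p2 Q^2), with Q^2 in GeV^2/c^2."""
    return n * (1.0 + p1 * q2) / (1.0 + p2 * q2)


def radius_fm(p1, p2):
    """r_p = sqrt(6 (p2 - p1)) for p1, p2 in (GeV/c)^-2, converted to fm."""
    return math.sqrt(6.0 * (p2 - p1)) * HBARC_GEV_FM


def rmse(bias, sigma):
    """Robustness measure: sqrt(bias^2 + sigma^2)."""
    return math.hypot(bias, sigma)


if __name__ == "__main__":
    # Arbitrary illustrative parameters (NOT fit results): p2 is chosen so that r_p = 0.831 fm.
    p1 = -0.07
    p2 = p1 + (0.831 / HBARC_GEV_FM) ** 2 / 6.0
    print(radius_fm(p1, p2))

    # The same radius follows from the slope of G_E^p at Q^2 = 0: r_p^2 = -6 dG/dQ^2.
    h = 1e-6
    slope = (rational_1_1(h, 1.0, p1, p2) - rational_1_1(0.0, 1.0, p1, p2)) / h
    print(math.sqrt(-6.0 * slope) * HBARC_GEV_FM)
```

Only the difference p₂ − p₁ enters the radius; the individual values control the curvature of the form factor at higher Q².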

Nature | Vol 575 | 7 November 2019 | 149


Fig. 4 | The proton charge radius. r_p as extracted
from the PRad data in this work, shown alongside
other measurements of r_p since 2010 and previous
CODATA recommended values. Our result is 2.7σ
smaller than the CODATA recommended value for
e-p experiments. The orange and blue vertical
bands show the uncertainty bounds of the μH and
CODATA values for e-p scattering, respectively.

To determine the systematic uncertainty in r_p, a Monte Carlo technique
was used to randomly smear the cross-section and G_E^p(Q²) data points for
each known source of systematic uncertainty. The value of r_p was extracted
from the smeared data and the process was repeated 100,000 times. The
root-mean square of the resulting distribution of r_p is recorded as the
systematic uncertainty. The dominant systematic uncertainties of r_p are
those that are Q²-dependent, which primarily affect the lowest Q² data:
the Møller radiative corrections, the background subtraction for the
1.1-GeV data and event selection. The uncertainty in r_p arising from the
finite Q² range and the extrapolation to Q² = 0 was investigated by varying
the Q² range of the mock dataset as part of the robustness study of the
Rational(1, 1) function. This uncertainty was found to be much smaller
than the relative statistical uncertainty, 0.8%. The total systematic
relative uncertainty on r_p was found to be 1.4%, and is detailed in
Supplementary Table 1 and described in the Supplementary Information.
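
The smearing procedure can be illustrated with a toy Python sketch. Several simplifications are assumptions of this example and not part of the published analysis: the radius is extracted with a straight-line fit (r² = −6 × slope, in units where Q² is in fm⁻²) instead of the Rational(1, 1) fit, each systematic source is modelled as a hypothetical Q²-dependent fractional shift applied coherently to all points with one Gaussian random amplitude per trial, and the "root-mean square of the distribution" is taken to mean the standard deviation about the mean.

```python
import math
import random
import statistics


def extract_radius_slope(q2, ge):
    """Toy extraction: least-squares straight line through (Q^2, G_E); r = sqrt(-6 * slope)."""
    n = len(q2)
    mean_x = sum(q2) / n
    mean_y = sum(ge) / n
    sxx = sum((x - mean_x) ** 2 for x in q2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(q2, ge))
    return math.sqrt(max(-6.0 * sxy / sxx, 0.0))


def systematic_uncertainty(q2, ge, sources, extract, n_trials=2000, seed=1):
    """Spread of the extracted radius when the data are smeared by each systematic source.

    `sources` is a list of functions mapping Q^2 to a fractional one-sigma shift.
    """
    rng = random.Random(seed)
    radii = []
    for _ in range(n_trials):
        smeared = list(ge)
        for rel_shift in sources:
            amplitude = rng.gauss(0.0, 1.0)  # one coherent random draw per source and trial
            smeared = [g * (1.0 + amplitude * rel_shift(x)) for g, x in zip(smeared, q2)]
        radii.append(extract(q2, smeared))
    return statistics.pstdev(radii)


if __name__ == "__main__":
    q2 = [0.01 * i for i in range(1, 51)]        # toy Q^2 grid in fm^-2
    ge = [1.0 - 0.83 ** 2 * x / 6.0 for x in q2]  # toy linear form factor with r = 0.83 fm
    low_q2_source = lambda x: 0.002 / (1.0 + 10.0 * x)  # hypothetical source, largest at low Q^2
    print(systematic_uncertainty(q2, ge, [low_q2_source], extract_radius_slope))
```

The default of 2,000 trials keeps the example fast; the text quotes 100,000 repetitions for the actual analysis.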

The value of r_p obtained using the Rational(1, 1) function is shown in
Fig. 4, with statistical and systematic uncertainties summed in quadrature.
Our result, obtained from Q² down to an unprecedented 2.1 × 10⁻⁴ GeV²/c²,
is about three standard deviations smaller than the previous
high-precision electron scattering measurement, which was limited to higher
Q² (>0.004 GeV²/c²). However, our result is consistent with the μH
Lamb-shift measurements, and also with the recent 2S-4P
transition-frequency measurement using ordinary H atoms. Given that the
lowest Q² reached in the PRad experiment is an order of magnitude lower
than in previous e-p experiments, and owing to the careful control of
systematic effects, our result indicates that the proton radius is smaller
than its previously accepted value from e-p measurements. Our result does
not support any fundamental difference between e-p and μ-p interactions and
is consistent with the updated value announced for the Rydberg constant by
CODATA.

The PRad e-p experiment covers Q² over two orders of magnitude
in one setting. The experiment also exploited the simultaneous detection
of e-p and e-e scattering to achieve good control of systematic
uncertainties, which were, by design, different from previous e-p
experiments. The extraction of r_p using functional forms with validated
robustness is another strength of this result. Our result demonstrates
a large discrepancy with contemporary, high-precision e-p
experiments. The result also implies that there is consistency between
proton charge radii as obtained from e-p scattering measurements on
ordinary hydrogen and spectroscopy of muonic hydrogen. The PRad
experiment demonstrates the clear advantages of the calorimeter-based
method for determining r_p from e-p experiments and points to
further possible improvements in the accuracy of this method. It is also
consistent with the recently announced shift in the Rydberg constant,
which has profound consequences, given that the Rydberg constant is
one of the most precisely known constants of physics.


Methods


The PRad experiment was conducted with 1.1-GeV and 2.2-GeV electron
beams from the Continuous Electron Beam Accelerator Facility (CEBAF)
accelerator incident on cold hydrogen atoms flowing through a windowless
target cell. The scattered electrons, after traversing the vacuum
chamber, were detected in the GEM detector and HyCal. They included
electrons from elastic e-p scattering and e-e Møller scattering
processes. The transverse (X-Y) positions measured by the GEM detector
were used to calculate the Q² value for each event. The e-p and e-e yields
were obtained using appropriate cuts on the energy deposited in HyCal
and the reconstructed angle. The e-p and e-e yields were binned as a
function of Q². A comprehensive Monte Carlo simulation of the PRad
experiment was used to extract the next-to-leading order e-p
cross-section from the experimental yields. The e-p cross-sections were
obtained by comparing the simulated and measured e-p yield relative
to the simulated and measured Møller scattering yield. The value of G_E^p
was extracted from the e-p cross-section using the Rosenbluth formula,
and using a parametrization of G_M^p. The proton charge radius, r_p, was
obtained from the extracted G_E^p(Q²) by fitting to the Rational(1, 1)
functional form and extrapolating to Q² = 0. The Rational(1, 1) functional
form was shown to be the most robust function for radius extraction
from the PRad data, giving consistent results with the smallest
uncertainties. See Supplementary Information for further details.


Data availability 


The raw data from this experiment are archived in Jefferson
Laboratory’s mass storage silo.

All computer codes used for data analysis and simulation are archived


The fundamental parameters of majority and minority charge carriers,
including their type, density and mobility, govern the performance of
semiconductor devices yet can be difficult to measure. Although the Hall
measurement technique is currently the standard for extracting the
properties of majority carriers, those of minority carriers have typically
only been accessible through the application of separate techniques. Here we
demonstrate an extension to the classic Hall measurement (a carrier-resolved
photo-Hall technique) that enables us to simultaneously obtain the
mobility and concentration of both majority and minority carriers, as well
as the recombination lifetime, diffusion length and recombination
coefficient. This is enabled by advances in a.c.-field Hall measurement
using a rotating parallel dipole line system and an equation,
Δμ_H = d(σ²H)/dσ, which relates the hole-electron Hall mobility difference
(Δμ_H), the conductivity (σ) and the Hall coefficient (H). We apply
this technique to various solar absorbers, including high-performance
lead-iodide-based perovskites, and demonstrate simultaneous access to
majority and minority carrier parameters and map the results against varying
light intensities. This information, which is buried within the photo-Hall
measurement, had remained inaccessible since the original discovery of the
Hall effect in 1879. The simultaneous measurement of majority and minority
carriers should have broad applications, including in photovoltaics and
other optoelectronic devices.


The Hall effect measurement is one of the most important characterization
techniques for electronic materials, and the effect has become the
basis of fundamental advances in condensed matter physics, such as the
integer and fractional quantum Hall effects. The measurements reveal
fundamental information about the majority charge carrier, that is, its
type (p or n), density and mobility. In a solar cell, the parameters of the
majority carrier determine the overall device architecture, the width
of the depletion region and the bulk series resistance. The properties
of the minority carrier, however, determine other key parameters that
directly affect the overall performance of the device, such as
recombination lifetime (τ), diffusion length (L_D) and recombination
coefficients (k_n). Unfortunately, the standard Hall measurement yields information
regarding only the majority carrier. Attempts to measure the
properties of both majority and minority carriers in high-performance
light-absorbing materials have been made; however, they require a wide
range of experimental techniques that typically use different sample
configurations and illumination levels, thereby presenting additional
complications in the analysis (Supplementary Information sections
F, G). The extraction of reliable information on charge carriers is
particularly sought after in the study of organic-inorganic hybrid
perovskites. This family of materials is currently receiving intense
attention, owing to rapid progress in their application in
high-performance solar cells (the current record power conversion
efficiency (PCE) for devices containing such materials is 25.2%) as well as
in other optoelectronic devices, including light-emitting diodes and
photodetectors.
A full understanding of the charge-transport properties of perovskites
will help to elucidate the operating principles of devices that contain
these materials, thereby guiding their further improvement.

In this work we present a carrier-resolved photo-Hall (CRPH) meas- 
urement technique that is capable of simultaneously extracting the 
mobilities, densities and subsequent derivative parameters (t, L,) of 
both majority and minority carriers as a function of light intensity. 
This technique relies on two key elements: an equation that yields the 
difference between the Hall mobilities of the hole and electron, anda 
high-sensitivity Hall measurement using a parallel dipole line (PDL) 
a.c. Hall system’® (Fig. 1a, b). In the classic Hall measurement without 
illumination, three parameters can be obtained for majority carriers: 
the type (p orn), from the sign of Hall coefficient H; the carrier density 
(n.=r/He); and the Hall mobility (4, =0H); where eis the electron charge 
and ris the Hall scattering factor. The key challenge in the photo-Hall 
transport problem—that is, extracting information from the majority 
and minority carriers—requires solving for three unknowns ata given 
illumination level: hole and electron (drift) mobility (4, zy) and their 
photocarrier densities (An, Ap), which are equal under steady-state 
conditions. Unfortunately, we have only two measured quantities: o 
and H, as a function of illumination. The key insight into solving this 
problem is illustrated in Fig. 1c. We consider two p-type systems with 
the same majority carrier density (p)) and mobility (u,) but different 
minority carrier mobilities (uy). When these systems are excited with 
the same photocarrier density, An,,,,,, they will produce different 0-H 
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Fig. 1| The carrier-resolved photo-Hall measurement. a, The PDL photo Hall 
setup for acomplete photo-Hall experiment. b, The rotating PDL magnet 
system that generates a unidirectional and single harmonic a.c. magnetic field 
at the centre (see animation in Supplementary Video 1). c, Theoretical 


curves owing to the increasing role of the minority carrier in the total 
conductivity, even though they start from the same point in the dark. 
Therefore, the characteristics of the o0-Hcurves—specifically the slope 
(dH/do)—contains detailed information about the mobilities of the 
two systems. We show that the Hall mobility difference, Au, =rAy= 
r(Up — Hy), is given as (Supplementary Information section B): 


_ d(o7H) _ dinH 
Aut, do a dino a () 


Note that oand Hare experimentally obtained as a function of varying 
light intensity or photocarrier density An; however, fortuitously, the An 
term cancels out of equation (1). There are two equivalent expressions 
for Aw,,in equation (1), which enable slope analysis for low and high injec- 
tion. The term dInH/dInohas special experimental meaning, as shown 
for the perovskite example discussed later. This equation applies 
to both p- and n-type materials and assumes that the dark carrier 
densities (p, or n,) are fixed, that An= Ap under steady-state conditions, 
and that the mobilities are constant as a function of light intensity 
(see Supplementary Information section B.2 for a generalized model 
in which mobilities vary with illumination). The Hall scattering factor 
rgenerally lies between 1 and 2. It approaches 1 at high magnetic field 
and generally is assumed” to be 1, including in this work. Using the 
known two-carrier expressions in the low-magnetic-field regime” 
(B «1/p)—that is, 0= e(pptp + Nuty) and H= r(p — B’n)/(p + Bn)’e, where 
pand nare hole and electron densities and £ = pty/pp is the mobility 
ratio—we can completely solve the photo-Hall transport problem, for 
example, for a p-type material: 


2o0(rApu- oH) - reAu"p,+ An, [rep, JreAu’p, +40(oH-rAp) (2) 
20(rAyu- oH) 


_ o(1- B)— eAup, (3) 
~— eAp(it B) 

Finally, we obtain μ_P = Δμ/(1 − β) and μ_N = βμ_P. Note that we need to know 
the background hole density, p_0, from the dark measurement. Equations (1)–(3) 
are referred to as the Δμ calculation model. 
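
The Δμ calculation model can be sketched numerically as follows. This is a minimal Python illustration under stated assumptions, not the authors' analysis program: the function names, the cm-based units (σ in S cm⁻¹, H in cm³ C⁻¹, densities in cm⁻³, mobilities in cm² V⁻¹ s⁻¹) and the root-selection rule (keep only roots with β > 0, μ_P > 0 and Δn ≥ 0) are choices made for the example. It evaluates equation (1) at one σ–H point and solves equations (2) and (3) for a p-type sample, with r = 1 by default.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge (C)


def delta_mu_hall(sigma, hall, dlnh_dlnsigma):
    """Equation (1): Hall mobility difference at one sigma-H point.

    dlnh_dlnsigma is the local log-log slope d ln H / d ln sigma.
    """
    return (2.0 + dlnh_dlnsigma) * sigma * hall


def solve_p_type(sigma, hall, delta_mu, p0, r=1.0):
    """Equations (2) and (3) for a p-type sample.

    delta_mu is mu_P - mu_N (that is, delta_mu_H / r). Returns a list of
    dicts for every root that passes the assumed physical filter.
    """
    a = r * E_CHARGE * delta_mu**2 * p0
    denom = 2.0 * sigma * (r * delta_mu - sigma * hall)
    disc = a + 4.0 * sigma * (sigma * hall - r * delta_mu)
    if denom == 0.0 or delta_mu == 0.0 or disc < 0.0:
        return []
    root = abs(delta_mu) * math.sqrt(r * E_CHARGE * p0) * math.sqrt(disc)
    solutions = []
    for sign in (1.0, -1.0):
        beta = (denom - a + sign * root) / denom  # equation (2)
        if beta <= 0.0 or abs(1.0 - beta) < 1e-12:
            continue
        dn = (sigma * (1.0 - beta) - E_CHARGE * delta_mu * p0) / (
            E_CHARGE * delta_mu * (1.0 + beta))   # equation (3)
        mu_p = delta_mu / (1.0 - beta)
        if mu_p > 0.0 and dn >= 0.0:
            solutions.append({"beta": beta, "dn": dn,
                              "mu_p": mu_p, "mu_n": beta * mu_p})
    return solutions
```

Both roots of equation (2) are evaluated because the sign that yields a physical solution depends on whether μ_P is larger or smaller than μ_N; in a synthetic check with known mobilities, only one root survives the filter.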

The second requirement to enable the CRPH measurement involves 
obtaining a clean Hall signal. Unfortunately, in many photovoltaic films, 
high sample resistance (R > 10 GΩ), as in the case of perovskites, or 
low mobility (μ < 1 cm² V⁻¹ s⁻¹) can produce noisy Hall signals. Therefore, 
a.c.-field Hall techniques coupled with Fourier analysis and lock-in 
detection are crucial. We recently developed a high-sensitivity a.c.-field 
calculation of two p-type systems with the same majority mobility (μ_P) but 
different minority mobility (μ_N) under increasing illumination, yielding 
different conductivity–Hall coefficient (σ–H) curves. The slope of the σ–H 
curve contains the information of Δμ. 


Hall system based on a rotating PDL magnet system. The PDL 
system is a recently discovered natural magnetic trap that harbours a 
field-confinement effect that generates a magnetic camelback potential 
along its longitudinal axis. This effect is used to optimize the field 
uniformity (Supplementary Information section A). The PDL Hall system 
consists of a pair of diametric cylindrical magnets separated by a 
gap. One magnet (the ‘master’) is driven by a motor and another (the 
‘slave’) follows in the opposite direction. This system produces a 
unidirectional and single harmonic field at the centre, where the sample 
resides (Fig. 1b, Supplementary Video 1), which forms the basis for a 
successful photo-Hall experiment. As well as the photo-Hall measurement, 
optical measurements (for example, transmission and reflectivity) 
to calculate the absorbed photon density (G_γ) can also be performed 
using the same setup (Fig. 1a, Supplementary Information section C). 

To demonstrate the CRPH technique, two examples are discussed in 
detail: a lead-halide-based perovskite film and a silicon sample, which 
serve as high (Δn > p_0) and low (Δn < p_0) injection cases, respectively. 
The first example uses a (FA,MA)Pb(I,Br)₃ (FA, formamidinium; MA, 
methylammonium) perovskite film, which was fabricated using the 
same method that produced a recent record PCE, but with further 
optimization of the process. A companion device in the same batch 
yielded a PCE of 20.8% (Methods). The measurement device is a 
six-terminal Hall bar with an active area of 2 mm × 4 mm and a film 
thickness d of 0.55 µm, as shown in Fig. 2b. First, we measured the sample in 
the dark and obtained the properties of the majority carrier: p type, 
p_0 = 8.3 × 10¹¹ cm⁻³ and μ_P = 9.8 cm² V⁻¹ s⁻¹. Next, we performed the 
measurements under several laser intensities (up to about 40 mW cm⁻²; 
wavelength λ = 638 nm). Examples of longitudinal (R_xx) and transverse 
(R_xy) magnetoresistance under light are shown in Fig. 2a. The R_xy trace 
shows the expected Hall signal with a Fourier component at the same 
frequency (f_ref) as the magnetic field B (Fig. 2a). The desired Hall signal 
R_H is obtained using numerical lock-in detection based on a reference 
sinusoidal signal with the same phase as B (Fig. 2a). The σ and H values 
are then calculated from R_xx and R_H. We also observe a second harmonic 
component at 2f_ref in the R_xy Fourier spectrum, which is also evident in 
the original R_xy trace as a double-frequency oscillation (Fig. 2a). This 
component is not the desired Hall signal and thus is rejected. It arises 
from another magnetoresistance effect, which is stronger in R_xx, and 
also appears in R_xy because of R_xx–R_xy mixing due to the finite size of the 
Hall bar contact arms. This highlights the importance of inspecting the 
Fourier spectrum of the Hall signal and of using lock-in detection, as 
opposed to simple amplitude measurement. 
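
The difference between lock-in detection and a simple amplitude reading can be illustrated with a short Python sketch. It is a generic software lock-in written for this example, not the authors' program: the function name and the synthetic trace (its amplitudes, the 40 s field period and the sampling) are hypothetical, and averaging over the whole record stands in for a finite lock-in time constant.

```python
import math


def lock_in_amplitude(signal, reference):
    """In-phase amplitude of `signal` relative to a sinusoidal `reference`.

    Both sequences must share the same sampling instants and should span an
    integer number of field periods so that other harmonics average to zero.
    """
    n = len(signal)
    sig_mean = sum(signal) / n
    ref_mean = sum(reference) / n
    ref = [x - ref_mean for x in reference]
    # Amplitude of the reference sinusoid, from its r.m.s. value.
    ref_amp = math.sqrt(2.0 * sum(x * x for x in ref) / n)
    # Multiply by the unit-amplitude reference and average (low-pass step).
    mixed = sum((s - sig_mean) * x for s, x in zip(signal, ref))
    return 2.0 * mixed / (n * ref_amp)


# Synthetic transverse trace: a Hall term in phase with B, a quadrature
# term, a second-harmonic magnetoresistance term and a d.c. offset.
period_s, n_periods, n_samples = 40.0, 20, 8000
t = [k * period_s * n_periods / n_samples for k in range(n_samples)]
w = 2.0 * math.pi / period_s
b_field = [0.7 * math.sin(w * x) for x in t]
r_xy = [3.0 * math.sin(w * x) + 2.0 * math.cos(w * x)
        + 5.0 * math.sin(2.0 * w * x) + 7.0 for x in t]

r_hall = lock_in_amplitude(r_xy, b_field)       # recovers the in-phase term
naive = 0.5 * (max(r_xy) - min(r_xy))           # simple amplitude reading
```

In this synthetic case the lock-in value returns the in-phase amplitude of 3, whereas the peak-to-peak reading is dominated by the second-harmonic and quadrature terms.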

The measurement returns a series of σ and H points that change 
substantially upon illumination (Fig. 2b): σ increases by a factor of 
around 340 and H decreases by a factor of around 1,400. Our photo-Hall 
Fig. 2 | Carrier-resolved photo-Hall analysis in a high-performance 
perovskite film. a, Magnetoresistance sweep (R_xx, longitudinal; R_xy, 
transverse), Fourier transform and lock-in detection of the Hall signal (R_H). 

b, σ–H plot for photo-Hall analysis. Inset, the perovskite Hall bar device. 

c, Majority (μ_P) and minority (μ_N) carrier mobility and photocarrier density Δn 


equation (equation (1)) provides a simple and quick insight into the 
data by looking at the slope of the σ–H data on a log scale. If the 
slope (d ln H/d ln σ) is equal to −2, then μ_P = μ_N; if it is larger (smaller) than −2 
while H is positive, then μ_P > μ_N (μ_P < μ_N). From Fig. 2b we obtain the 
overall d ln H/d ln σ = −1.36, which implies that μ_P > μ_N. Furthermore, we can 
evaluate Δμ_H at any σ–H point; for example, at the maximum light 
intensity, Δμ_H = 1.9 cm² V⁻¹ s⁻¹. We proceed to solve for μ_P, μ_N and Δn using 
the previously discussed Δμ model and plot the values with respect 
to G_γ (Fig. 2c). Owing to a considerable change in mobility with light 
intensity in the perovskite sample, we used the generalized Δμ model (as 
discussed in Supplementary Information section B.2). Given the varying 
mobilities, this model introduces a correction that yields final mobility 
values as much as two times smaller than the initial mobility values at 
the highest light intensity. We obtain the final solution μ_P = 14–28 
cm² V⁻¹ s⁻¹ and μ_N = 7–26 cm² V⁻¹ s⁻¹; both μ_P and μ_N increase with G_γ. 
This increase could arise from a light-modulated intragranular barrier 
effect. We further obtained Δn, which increases with G_γ as expected. For 
comparison, we also plot the ‘single-carrier’ Hall mobility (μ_H = σH) and 
Hall density (n_H = 1/eH), which have often been used to estimate μ and 
Δn in previous photo-Hall studies. As seen in Fig. 2c, these estimates 
can be very different from the actual μ_P, μ_N and Δn values obtained from 
the CRPH measurement. 

In addition to the basic properties of the majority and minority 
carriers, we then investigated the recombination mechanism in detail by 
plotting Δn against G_γ, as shown in Fig. 2c. The data show two power-law 
regimes following Δn ∝ G_γ^m with m = 1 and m = 0.5. The m = 1 (m = 0.5) 
behaviour is expected for a monomolecular recombination (bimolecular 
recombination) regime; however, in this case the m = 0.5 regime is more 


plotted against absorbed photon density G_γ, with n_H and μ_H denoting the 
single-carrier Hall density and mobility. The background carrier density p_0 is 
indicated by a grey line. d, Recombination lifetime (τ) and diffusion length (L_D) 
mapped against Δn. All dashed curves are guides for the eye. 


likely explained by trapping. Consider, for example, a single-level trap 
model. The lifetime, for example for a hole, is given as τ = 1/(C_p n_t), where 
C_p represents the capture cross-section for a hole and n_t is the density of 
the trapped electrons. At very low light intensities, the lifetime is 
constant because n_t is dominated by the equilibrium (dark) level of charged 
electron traps, n_t = n_t0. As the light intensity increases, the number of 
charged traps increases owing to the increase in injected electrons. 
This in turn reduces τ and explains the low exponent (m = 0.5) seen in 
Fig. 2c. The alternative explanation of bimolecular recombination can 
be discarded because the maximum photocarrier density in our experiment 
(Δn ≈ 10¹⁴ cm⁻³) is around 1,000 times lower than the typical density 
required (Δn ≈ 10¹⁷ cm⁻³) in order for the bimolecular recombination to 
dominate in perovskite (Supplementary Information section D). 
The measurement also provides access to the recombination lifetime, 
τ = Δn/G, and the carrier diffusion length, L_D = √(k_B T μτ/e), where k_B is the 
Boltzmann constant and T is temperature. G is the photocarrier generation 
rate given as G = ηG_γ, where η is the photocarrier generation 
efficiency, which is often assumed to be unity. At high injection level, 
when Δn, Δp > p_0, it is more appropriate to use an ambipolar diffusion 
length, which can also be calculated from our CRPH data: 
L_D,am = √{k_B Tτ(n + p)/[e(n/μ_P + p/μ_N)]}. Furthermore, we can plot these 
results as a function of Δn (Fig. 2d). The hole, electron and ambipolar 
diffusion lengths fall very close to each other given similar hole and 
electron mobilities. From this analysis, we obtain values of τ of up to 
40 µs and L_D values of around 30 µm at the lowest light intensity. 
However, these values vary markedly, and decrease to 44 ns and 
1.7 µm, respectively, at the highest light intensity. The relatively high 
values of τ and L_D obtained in this study attest to the high quality of this 
Fig. 3 | Carrier-resolved photo-Hall analysis in a single-crystal p-silicon 
sample. a, Transverse magnetoresistance sweep (R_xy), Fourier transform and 
lock-in detection of the Hall signal. b, σ–H plot for photo-Hall analysis. The inset 
shows the equivalent plot in the form of σ²H against σ. c, Majority (μ_P) and 
minority (μ_N) mobility and photocarrier density Δn plotted against absorbed 


perovskite film. We also compare our results with recent transport 
studies for perovskites (Supplementary Table 3) and obtain general 
agreement. We highlight that, given the large variation in τ and L_D with 
G_γ, it is crucial to state the values of G_γ or Δn when reporting these 
measurements. 
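
The lifetime and diffusion-length expressions used for the perovskite film can be checked with a few lines of Python. This is an illustrative sketch: the function names are hypothetical, room temperature is taken as 300 K, and the mobility of 27 cm² V⁻¹ s⁻¹ is an assumed representative value within the reported high-intensity range rather than a measured data point.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant (J/K)
E_CHARGE = 1.602176634e-19  # elementary charge (C)


def lifetime_s(dn_cm3, g_gamma_cm3_s, eta=1.0):
    """tau = dn / G, with G = eta * G_gamma."""
    return dn_cm3 / (eta * g_gamma_cm3_s)


def diffusion_length_cm(mu_cm2_vs, tau_s, temperature_k=300.0):
    """L_D = sqrt(k_B T mu tau / e), with mu in cm2/Vs and tau in s."""
    return math.sqrt(K_B * temperature_k * mu_cm2_vs * tau_s / E_CHARGE)


def ambipolar_diffusion_length_cm(mu_p, mu_n, p_cm3, n_cm3, tau_s,
                                  temperature_k=300.0):
    """L_D,am = sqrt(k_B T tau (n + p) / [e (n/mu_P + p/mu_N)])."""
    mu_am = (n_cm3 + p_cm3) / (n_cm3 / mu_p + p_cm3 / mu_n)
    return math.sqrt(K_B * temperature_k * mu_am * tau_s / E_CHARGE)


# Highest-intensity perovskite case: tau = 44 ns, assumed mu = 27 cm2/Vs.
l_d_um = diffusion_length_cm(27.0, 44e-9) * 1e4  # about 1.75 um
```

With these inputs the diffusion length comes out close to the 1.7 µm quoted for the highest light intensity; when μ_P = μ_N the ambipolar expression reduces to the single-carrier one.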

In the second example, we investigate a single-crystal silicon sample. 
The sample is a Hall bar made of B-doped, Czochralski-grown silicon 
with an active area of 3 mm × 3 mm and a thickness d of 725 µm. This study 
demonstrates CRPH measurement of a well-known material, in the low 
injection regime and with a large thickness (d > 1/α and L_D). We used a 
laser with a wavelength of 638 nm and an intensity of up to 50 mW cm⁻². 
First, we obtain the σ–H curve that begins with a positive H value in the 
dark, indicating a p-type material with p_0 = 6.6 × 10” cm⁻³ (Fig. 3b). At 
higher light intensity, H becomes negative, indicating increasing electron 
(minority) carrier conductivity. For convenient extraction of Δμ, we plot 
σ²H against σ (Fig. 3b, inset), which is more appropriate for low-injection 
analysis. The data show a monotonic behaviour with nearly constant 
slope in the high-intensity regime, which yields Δμ_H = −1,070 cm² V⁻¹ s⁻¹. 

We then calculated μ_P, μ_N and Δn using equations (2) and (3), with 
the results shown in Fig. 3c, d. We obtain an average majority mobility 
of μ_P = 486 cm² V⁻¹ s⁻¹ and a minority mobility of μ_N = 1,560 cm² V⁻¹ s⁻¹; 
these values are in good agreement with the hole (around 500 cm² V⁻¹ s⁻¹) 
and electron (around 1,500 cm² V⁻¹ s⁻¹) mobilities in silicon. These 
values are sufficiently constant as a function of light intensity that we 
do not need to attempt the mobility-variation correction using the 
generalized model as for the perovskite analysis. We also plotted Δn 
against G_γ and obtained a curve that follows Δn ∝ G_γ^m with m = 1.2; this 
suggests a monomolecular recombination regime, as expected for 
silicon. At the highest light intensity, we obtain τ = 2 µs and L_D,N ≈ 90 µm. 
For comparison, we also measured the lifetime using a quasi-steady-state 
photoconductance technique; this yields τ values of 1–5 µs for 
photon density G_γ. n_H is the single-carrier Hall density. d, Recombination 
lifetime and minority carrier diffusion length plotted against Δn. τ values 
measured using the quasi-steady-state photoconductance decay (QSS-PCD) 
technique are shown as black circles. All dashed curves are guides for the eye. 


Δn = 2 × 10"–10" cm⁻³, in close agreement with the CRPH result (Fig. 3d, 
Methods). As an additional example of CRPH measurement in the 
low-injection regime we studied the material kesterite (Cu₂ZnSn(S,Se)₄), 
which is also of high interest for photovoltaics applications 
(Supplementary Information sections E, G). 

In contrast to the classic Hall effect, which only yields three 
parameters, the CRPH technique yields 7N parameters: μ_P, μ_N, Δn, τ, L_D,N, L_D,P 
and L_D,am repeated at N light intensities. Additionally, it is also possible 
to calculate the relevant recombination coefficient; for example, k_1 = 1/τ 
in the monomolecular recombination regime. Of the many electrical 
transport measurements performed on perovskites (as summarized 
in Supplementary Table 3), this is the first time to our knowledge that 
all minority and majority carrier characteristics have been determined 
simultaneously from a single experimental setup, on a single sample 
and mapped against varying light intensities under steady-state 
conditions. This demonstrates the power of the CRPH technique and 
represents a considerable expansion of the original Hall effect measurement. 
The approach should also provide a valuable means of investigating 
the charge-carrier parameters of a wide range of conventional and 
emerging semiconductors for solar cells and broader applications. 
Methods 


Photo-Hall measurement 

The experimental setup is shown in Fig. 1a. All measurements were 
performed at room temperature. Photoexcitation was achieved using 
a solid-state red laser (A = 638 nm, maximum power 190 mW) for the 
perovskite and silicon, or a blue laser (A = 450 nm, maximum power 
500 mW) for kesterite (Supplementary Information sections E, G). 
The sample is centred between the PDL magnets and the laser beam 
is directed through a motorized neutral density filter. A cylindrical 
lens is used to expand the beam, while a wedge lens deflects the beam 
onto the sample area. A beam splitter is used to split the beam towards 
a ‘monitor’ silicon photodetector (PD) to measure the monitor 
photocurrent (I_PD-MON) at every light intensity. I_PD-MON is used to determine 
the incident photon flux (Φ) and the absorbed photon density (G_γ) on 
the sample, which is given as G_γ = Φ(1 − R)[1 − exp(−αd)]/d, where R is 
the reflectivity, α is the absorption coefficient and d is the thickness 
(Supplementary Information section C). The optical properties of the 
films studied in this work are presented in Supplementary Table 2. 
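
As a rough numerical illustration of the absorbed photon density, the sketch below converts a laser intensity into a photon flux and applies the reflection and absorption factors. It rests on several assumptions: the flux Φ is taken to multiply the expression (1 − R)[1 − exp(−αd)]/d, the flux is estimated from intensity and wavelength rather than from the calibrated photodetector ratio described in the Methods, and the reflectivity and absorption coefficient used in the example call are hypothetical placeholders, not the values in Supplementary Table 2.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck constant (J s)
C_LIGHT = 2.99792458e8     # speed of light (m/s)


def photon_flux_cm2_s(intensity_w_cm2, wavelength_nm):
    """Photons per cm2 per second for monochromatic light."""
    photon_energy_j = H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9)
    return intensity_w_cm2 / photon_energy_j


def absorbed_photon_density(flux_cm2_s, reflectivity, alpha_cm, thickness_cm):
    """G_gamma = flux * (1 - R) * [1 - exp(-alpha d)] / d, in cm^-3 s^-1."""
    absorbed_fraction = 1.0 - math.exp(-alpha_cm * thickness_cm)
    return flux_cm2_s * (1.0 - reflectivity) * absorbed_fraction / thickness_cm


# 40 mW/cm2 at 638 nm on a 0.55 um film; R and alpha are placeholders.
flux = photon_flux_cm2_s(40e-3, 638.0)
g_gamma = absorbed_photon_density(flux, reflectivity=0.2,
                                  alpha_cm=5e4, thickness_cm=0.55e-4)
```

Because G_γ is averaged over the full thickness d, a thick, strongly absorbing sample gives a value close to Φ(1 − R)/d.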

The details of the PDL Hall system are described in Supplementary 
Information section A. The electronic instrumentation consists of a 
custom-built PDL motor control box, a Keithley 2450 source meter unit 
(SMU) to apply the voltage or current source to the sample, a Keithley 
2001 digital multimeter (DMM) for voltage measurement, and a Keithley 
7065 Hall switch matrix card with high-impedance buffer amplifiers for 
routeing the signals between the samples, SMU and DMM. For samples 
with relatively high mobility (perovskite and silicon) we use the d.c. 
current excitation mode, and for low-mobility samples (for example, 
kesterite) we use the a.c. current excitation mode with an SRS830 lock-in 
amplifier to achieve better noise rejection. The PD current is measured 
using a Keithley 617 electrometer. 

At every light intensity, the sheet resistance (R_s) is obtained by 
measuring two states and eight states of longitudinal magnetoresistance 
(R_xx) for six-terminal Hall bar and four-terminal van der Pauw samples, 
respectively. For the Hall bar sample, the sheet resistance is given as 
R_s = R_xx W/L, where W is the width and L is the length of the Hall bar 
active area. The conductivity of the sample is given as σ = 1/(R_s d). Next, 
the transverse magnetoresistance (R_xy) is measured. The PDL master 
magnet is rotated by a stepper motor system, typically with a speed 
of 1–2 r.p.m., to generate an a.c. field. A typical magnetic field 
amplitude on the sample is around 0.70 T for a PDL magnet gap of around 
10 mm. A Hall sensor is placed under the master magnet to monitor the 
oscillating field. The field and R_xy are recorded as a function of time for 
15–30 min each sweep. This measurement is repeated at several light 
intensities ranging from dark to the brightest level, while recording the 
‘monitor’ PD current to determine G_γ. After all of the measurements are 
completed, the sample is replaced with a ‘reference’ photodetector 
(PD). We then determine the photocurrent ratio, k_PD = I_PD-REF/I_PD-MON, 
between the reference PD and the monitor PD at every given light 
intensity (see Supplementary Information section C for further detail). 

Hall signal analysis is performed using a custom program developed 
in MATLAB. Fourier spectral analysis is used to determine the 
existence of the magnetoresistance signal (R_xy) at the same frequency as 
the magnetic field. We then proceed with phase-sensitive lock-in 
detection, implemented in software, to extract the in-phase component 
of the Hall signal (R_H) while rejecting the out-of-phase component 
that arises from various sources, such as Faraday induction 
of an electromotive force. We use a typical lock-in time constant of 
120–300 s. The Hall coefficient is calculated using H = R_H d/B_0, where B_0 
is the magnetic field amplitude. Therefore, at every light intensity, we 
obtain a set of σ and H values and proceed with the photo-Hall analysis. 
We also provide additional CRPH studies for MAPbI₃ perovskite and a 
kesterite sample in Supplementary Information sections E, G. For the 
alternative lifetime measurement in silicon using the quasi-steady-state 
photoconductance technique we used a Sinton Instruments WCT-120. 


The measurement was performed on the 6” silicon wafer from which 
the Hall sample originated. 
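
The conversion from measured resistances to the σ and H values used in the photo-Hall analysis can be written compactly. The Python sketch below is illustrative only: the function names are hypothetical, lengths are in cm, and the field is converted from tesla to V s cm⁻² so that H comes out in cm³ C⁻¹. It also returns the ‘single-carrier’ estimates σH and 1/(eH) for comparison.

```python
E_CHARGE = 1.602176634e-19       # elementary charge (C)
TESLA_TO_V_S_PER_CM2 = 1e-4      # 1 T = 1 V s/m^2 = 1e-4 V s/cm^2


def conductivity_s_cm(r_xx_ohm, width_cm, length_cm, thickness_cm):
    """sigma = 1 / (R_s d), with sheet resistance R_s = R_xx W / L."""
    r_sheet = r_xx_ohm * width_cm / length_cm
    return 1.0 / (r_sheet * thickness_cm)


def hall_coefficient_cm3_c(r_hall_ohm, thickness_cm, b0_tesla):
    """H = R_H d / B_0, returned in cm^3/C."""
    return r_hall_ohm * thickness_cm / (b0_tesla * TESLA_TO_V_S_PER_CM2)


def single_carrier_estimates(sigma_s_cm, hall_cm3_c):
    """Single-carrier Hall mobility (sigma * H) and Hall density 1 / (e H)."""
    return sigma_s_cm * hall_cm3_c, 1.0 / (E_CHARGE * hall_cm3_c)
```

These single-carrier estimates coincide with the true majority-carrier values only in the dark; under illumination the two-carrier analysis is needed.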

We also make a general remark about the CRPH analysis. First, in 
the very low light intensity (or low injection) regime (for example, 
G_γ < 5 × 10” cm⁻³ s⁻¹ for the silicon sample in Fig. 3c) the CRPH analysis 
results become inaccurate: the Δn values are scattered higher than 
expected from the monomolecular recombination trend. From our 
experience in analysing many samples so far (for example, perovskite, 
Si, kesterite), we expect that this could be due to the accuracy limitation 
of the σ and H measurements. For very low light intensity, typically we 
cannot resolve Δn values smaller than 1% of p_0, as shown in Fig. 3c. The 
best analysis results come from the higher-intensity regime, in which 
there is large Δn (Δn > p_0/100) such that substantial changes in σ and H 
are noted, as in the perovskite example. Second, if the mobility values 
of the systems are very low, such as in the case of kesterite samples 
(μ_P = 1 cm² V⁻¹ s⁻¹ and μ_N = 10 cm² V⁻¹ s⁻¹), the Hall coefficient 
measurement becomes noisier and this also affects the accuracy of the analysis. 


Perovskite solar cell 

The perovskite films for photo-Hall study are based on the 
(FAPbI₃)₁₋ₓ(MAPbBr₃)ₓ mixed-perovskite system, and use a halide perovskite 
composition analogous to that used in a previously reported device with a 
PCE of 17.9% at x = 0.15. The PCE was subsequently improved 
by modifying the film deposition method and adjusting the value of x. 
We demonstrated a high average PCE of 20.8% at x = 0.12 for a 
(FAPbI₃)₀.₈₈(MAPbBr₃)₀.₁₂ device (FTO/bl-TiO₂/mp-TiO₂/perovskite/PTAA/Au) under 
1 sun conditions (AM 1.5G spectrum, 100 mW cm⁻²). Extended Data 
Fig. 1a shows photocurrent density–voltage (J–V) curves for the 
(FAPbI₃)₀.₈₈(MAPbBr₃)₀.₁₂ device measured by reverse and forward scans with 
10-mV voltage steps and 40-ms delay times under AM 1.5G illumination. 
The device exhibits a short-circuit current density (J_SC) of 23.3 mA cm⁻², 
open-circuit voltage (V_OC) of 1.13 V and fill factor (FF) of 80.0% by reverse 
scan. A slightly decreased FF (to 77.8%) by forward scan results in an 
average PCE of 20.8%. An external quantum efficiency (EQE) spectrum 
for the device is shown in Extended Data Fig. 1b, demonstrating a very 
broad plateau of over 80% between 400 nm and 750 nm. The histogram 
of PCE values for 80 cells is shown in Extended Data Fig. 1c. 
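
The quoted average efficiency can be reproduced from the device parameters with the standard relation PCE = J_SC × V_OC × FF / P_in. The short Python check below is a sketch that assumes J_SC and V_OC are unchanged between the reverse and forward scans (the text reports only the change in fill factor); the function name is hypothetical.

```python
def pce_percent(j_sc_ma_cm2, v_oc_v, fill_factor, p_in_mw_cm2=100.0):
    """Power conversion efficiency in percent (mA/cm2 x V = mW/cm2)."""
    return 100.0 * j_sc_ma_cm2 * v_oc_v * fill_factor / p_in_mw_cm2


reverse = pce_percent(23.3, 1.13, 0.800)  # reverse scan, FF = 80.0%
forward = pce_percent(23.3, 1.13, 0.778)  # forward scan, FF = 77.8%
average = 0.5 * (reverse + forward)
```

Under this assumption the two scans give roughly 21.1% and 20.5%, and their mean rounds to the reported 20.8%.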

The J–V curves were measured using a solar simulator (Newport, Oriel 
Class A, 91195A) with simulated AM 1.5G illumination at 100 mW cm⁻² 
and a calibrated Si-reference cell certificated by the National Renewable 
Energy Laboratory. The system uses a Keithley 2420 source meter for 
J–V measurement. The measurement was performed at 25 °C under 
ambient conditions. The devices were pre-illuminated for 30 s under 
1 sun and the measurement was performed in the reverse (from 1.5 V 
to −0.2 V) and the forward (from −0.2 V to 1.5 V) scanning directions. 
The current density–voltage (J–V) curves for the perovskite devices 
were measured by masking the active area (0.16 cm² measured using 
an optical microscope) with a metal mask of 0.094 cm² in area. The EQE 
was measured by a power source (Newport 300W Xenon lamp, 66920) 
with a monochromator (Newport Cornerstone 260) and a multimeter 
(Keithley 2001). 


Fabrication of perovskite solar cells 

A 70-nm-thick blocking layer of TiO₂ (bl-TiO₂) was deposited onto an 
F-doped SnO₂ (FTO, Pilkington, TEC8) substrate by spray pyrolysis 
using a 10 vol% titanium diisopropoxide bis(acetylacetonate) 
solution in ethanol at 450 °C. A TiO₂ slurry was prepared by diluting TiO₂ 
pastes (Share Chem, SC-HT040) in mixed solvent 
(2-methoxyethanol:terpineol = 3.5:1 w/w). The 100-nm-thick mesoporous TiO₂ 
(mp-TiO₂) was fabricated by spin coating the TiO₂ slurry onto the bl-TiO₂ 
layer and subsequently calcining at 500 °C for 1 h in air to remove the 
organic components. The mp-TiO₂ layer was then treated with 
bis(trifluoromethane)sulfonimide lithium salt. Then, the (FAPbI₃)₀.₈₈(MAPbBr₃)₀.₁₂ 
film was formed using the method described in the section ‘Fabrication 
of perovskite Hall samples’. A polytriarylamine (PTAA) (EM index, 


M_w = 17,500 g mol⁻¹)/toluene (10 mg/1 ml) solution with an additive of 
7.5 µl Li-bis(trifluoromethanesulfonyl)imide (Li-TFSI)/acetonitrile 
(170 mg/1 ml) and 4 µl 4-tert-butylpyridine (TBP) was spin-coated on 
the perovskite layer/mp-TiO₂/bl-TiO₂/FTO substrate at 3,000 r.p.m. 
for 30 s. 


Fabrication of perovskite Hall samples 

All precursor materials were prepared following a previous report. To 
form the perovskite thin film based on the (FAPbI₃)₀.₈₈(MAPbBr₃)₀.₁₂ 
absorber, a 1.05 M solution of NH₂CH=NH₂I (FAI) and 
CH₃NH₃Br (MABr) with PbI₂ and PbBr₂ in N,N-dimethylformamide (DMF) 
and dimethyl sulfoxide (DMSO) (6:1 v/v) was prepared by stirring at 
60 °C for 1 h. Then the solution was coated onto a fused silica substrate 
heated to 60 °C by two consecutive spin-coating steps, at 1,000 and 
5,000 r.p.m., for 5 s and 10 s, respectively. During the second 
spin-coating step, 1 ml ethyl ether was poured onto the substrate after 5 s. 
Then, the substrate was heat-treated at 150 °C for 10 min. A compact 
(FAPbI₃)₀.₈₈(MAPbBr₃)₀.₁₂ film with a thickness of 550 nm was obtained. 
Then, we selectively scraped the film off the substrate to pattern 
the desired Hall bar configuration for photo-Hall measurement. The 
Hall bar is a six-terminal device as shown in Fig. 2b, inset. We deposited 
an Au metal contact pattern (100-nm thick) and installed a header pin 
to mount the sample to the PDL Hall tool. 


Data availability 


The datasets generated and analysed during the current study are avail- 
able from the corresponding author on reasonable request. 
Although copper oxide high-temperature superconductors constitute a complex 
and diverse material family, they all share a layered lattice structure. This curious fact 
prompts the question of whether high-temperature superconductivity can exist 
in an isolated monolayer of copper oxide, and if so, whether the two-dimensional 
superconductivity and various related phenomena differ from those of their 
three-dimensional counterparts. The answers may provide insights into the role of 
dimensionality in high-temperature superconductivity. Here we develop a 
fabrication process that obtains intrinsic monolayer crystals of the 
high-temperature superconductor Bi₂Sr₂CaCu₂O₈₊δ (Bi-2212; here, a monolayer refers to a 
half unit cell that contains two CuO₂ planes). The highest superconducting transition 
temperature of the monolayer is as high as that of optimally doped bulk. The lack of 
dimensionality effect on the transition temperature defies expectations from the 
Mermin–Wagner theorem, in contrast to the much-reduced transition temperature 
in conventional two-dimensional superconductors such as NbSe₂. The properties of 
monolayer Bi-2212 become extremely tunable; our survey of superconductivity, the 


pseudogap, charge order and the Mott state at various doping concentrations 
reveals that the phases are indistinguishable from those in the bulk. Monolayer Bi- 
2212 therefore displays all the fundamental physics of high-temperature 
superconductivity. Our results establish monolayer copper oxides as a platform for 
studying high-temperature superconductivity and other strongly correlated 
phenomena in two dimensions. 


In systems with reduced dimensions, long-range order (superconductivity 
in particular) is strongly suppressed, as in the case of 
conventional Bardeen–Cooper–Schrieffer-type superconductors, and yet 
all high-temperature copper oxide superconductors have a layered 
structure with varying degrees of anisotropy. This apparent dichotomy 
may be the key to high-temperature superconductivity (HTS), and 
it raises the question of whether HTS and various correlated 
phenomena associated with it are different in two dimensions. This question 
is important for two reasons. First, most HTS theories are based on 
purely two-dimensional (2D) models, whereas experiments show 
that supercurrent phase coherence, charge ordering and charge 
dynamics all have a 3D nature. Second, much of what we know about 
HTS came from experimental tools such as scanning tunnelling 
microscopy/spectroscopy (STM/STS) and angle-resolved photoemission 
spectroscopy (ARPES) that probe the surface of the materials; HTS 
as a bulk property was inferred from the surface measurements. The 
bulk-surface correspondence becomes ideal if the HTS is truly 2D. 
To resolve these issues experimentally, an isolated monolayer high- 
temperature superconductor is needed. Such an atomically thin crystal 


would represent an ideal correlated 2D system for exploring quantum 
phenomena in reduced dimensions. 

Monolayer HTS has previously been studied mostly in epitaxial oxide 
heterostructures, where the active layers are buried between 
interfaces. Such systems are not accessible to spectroscopic tools such as 
STM/STS and ARPES. In recent years, an alternative, top-down approach 
has emerged: it has become possible to mechanically exfoliate 
monolayer atomic crystals (termed ‘2D materials’) from the layered bulk. 
High-quality 2D materials ranging from insulators to metals and 
superconductors have been produced this way. 

Experimentally extracting monolayers from bulk high-temperature 
superconductors, however, turned out to be extremely challenging. 
Although many of the bulk high-temperature superconductors are 
considered stable under ambient conditions, they are highly prone 
to chemical degradation when thinned to monolayers. Indeed, 
monolayer Bi-2212 has been found to be insulating or superconducting 
with a much reduced transition temperature (T_c). The suppression 
is seemingly consistent with increased fluctuations expected in 2D 
superconductors. But given that the material is extremely sensitive 
Fig. 1 | Fabrication and characterization of atomically thin Bi-2212 transport 
devices. a, Atomic structure of Bi-2212. ‘Monolayer’ refers to a half unit cell in 
the out-of-plane direction that contains two CuO₂ planes. The monolayers are 
separated by van der Waals gaps in bulk Bi-2212. b, Optical image of a typical 
Bi-2212 thin flake exfoliated on Si wafer covered with 285-nm-thick SiO₂. Scale bar, 
30 µm. c, Atomic force microscopy (AFM) image of the same flake shown in b 
(region marked by the black square). L, layer. Scale bar, 10 µm. d, Cross-sectional 
profile of optical contrast along the red line in b, in comparison with the cross- 


to environment and to doping variations, all extrinsic factors must 
be eliminated before ascribing the reduction of T_c in monolayers to 
the effect of dimensionality. The outstanding challenge has been to 
fabricate high-quality monolayer crystals and probe their intrinsic 
electronic structure. 

Here we overcome these challenges by developing sample fabrication 
processes that preserve the intrinsic properties of monolayer Bi-2212. 
We first pinpoint two main causes of sample degradation—reaction 
with water vapour and rapid loss of oxygen dopant. We find that the 
degradation slows down in a cold, inert environment, in which pristine 
monolayer Bi-2212 can be obtained. Unlike the bulk crystal, the 
monolayer Bi-2212 is extremely tunable: we can continuously vary its doping 
level in situ and map out major phases from the over-doped regime to 
the Mott insulating regime, in a single monolayer device. We find that 
the highest T_c of the monolayer is as high as that of optimally doped 
bulk. Moreover, STM/STS study reveals that the monolayer develops the 
same rich set of phases—HTS, pseudogap, charge order and Mott 
sectional profile of AFM topography at the same location (blue line in c). 

The quantized steps in contrast and height profiles correspond to monolayer 
terraces of Bi-2212. e, Optical image of a monolayer Bi-2212 device. The bulk flake 
in contact with the monolayer is cut into separate pieces, which serve as 
electrical leads for transport measurements. Scale bar, 100 µm. f, Typical 
temperature-dependent resistance of a monolayer Bi-2212 sample (red) in 
comparison with that of an optimally doped bulk crystal (blue). Resistances are 
normalized by their values at T = 200 K. 


insulating phase, in particular—that were observed on the bulk surface. 
Detailed characterization of the phases reveals that they are 
indistinguishable from those in the bulk. A monolayer, therefore, contains all the 
essential physics of Bi-2212: that is, HTS in Bi-2212 is essentially a 2D 
phenomenon. 


Fabricating pristine monolayer Bi-2212 

We start with bulk Bi-2212 with a slightly modified stoichiometry, 
Bi, Sr,,CaCu,Og,5, which has a highest T_c of 88 K at optimal doping. In a 
monolayer Bi-2212, two CuO₂ planes, separated by a Ca layer, are 
sandwiched between SrO and BiO planes to form a charge-neutral, 
septuple-layered slab as shown in Fig. 1a. The parent compound of Bi-2212 is an 
antiferromagnetic Mott insulator. Doping holes into the CuO₂ planes 
generates a pseudogap phase that is characterized by strong depletion 
of density of states (DOS) near the Fermi level. As the doping level 
p (holes per CuO₂ plaquette) increases, the pseudogap phase evolves 
Fig. 2 | Tunable high-temperature superconductivity in monolayer Bi-2212. 
a, Temperature-dependent resistivity R(p, T) of a monolayer Bi-2212 (sample A) 
that is initially over-doped. Data were acquired between annealing cycles that 
progressively lower the doping level of the sample (from purple to red). 

b, Conductivity plotted as a function of temperature and doping level. Doping 
level p is determined from p = 217 Ω/R_□(T = 200 K). Black circles denote the 
onset of the pseudogap state at T*. Here the vertical error bars represent 
uncertainties in locating T* at which the temperature-dependent resistance 


into a superconducting phase with highest T_c reaching 91 K at an 
optimal doping level of p = 0.16. Oxygen doping is therefore a key 
variable that determines the electronic structure in Bi-2212. Because 
the van der Waals interaction between the layers is weak, atomically 
thin Bi-2212 flakes can be obtained through mechanical exfoliation 
on an oxygen-plasma-treated SiO₂ surface. Figure 1b and e display 
optical images of few-layer Bi-2212 in which the monolayer region is as large 
as several hundreds of micrometres in diameter (the number of layers is 
identified from the optical contrast, which correlates well with the thick- 
ness of the crystals determined from atomic force microscopy; Fig. 1d). 

The exfoliated monolayer Bi-2212 is extremely sensitive to its environment.
We find that the monolayers are insulating if the specimen is prepared
under ambient conditions, consistent with previous reports’.
A systematic investigation (see Extended Data Table 1 and Extended
Data Fig. 1) reveals that exposing the monolayers to air, albeit briefly,
renders them insulating. Guided by the investigation, we succeeded
in obtaining high-quality, intrinsic monolayer Bi-2212 by fabricating
samples on a cold stage kept at −40 °C inside an Ar-filled glove box with
water and oxygen content below 0.1 ppm. Finally, we make electrical
contacts to the monolayer flakes by cold-welding indium/gold microelectrodes
(see Methods and Extended Data Table 1) on top. The flakes
are then cut into an appropriate geometry with a sharp tip (Fig. 1e), and
quickly transferred into an evacuated sample chamber for subsequent 
transport measurements. We have also obtained monolayer Bi-2212 of 
similar quality at low temperatures under ultra-high vacuum (UHV) for 
separate STM/STS study; details of the sample fabrication procedure 
are provided in the Methods. 

Figure 1f shows the normalized resistance of a monolayer in compari- 
son with that of optimally doped bulk Bi-2212. The monolayer retains 
HTS, and the sharp superconductivity transition signifies the high 
quality of the sample. More surprisingly, the Tc of the monolayer is
almost as high as the optimal Tc in the bulk, indicating that HTS in 2D
monolayer Bi-2212 does not differ appreciably from that in 3D bulk. This
is corroborated by an accurate quantitative comparison of monolayer
and bulk Tc, which we discuss below.
deviates from linear behaviour. White circles mark the superconducting
transition temperature Tc. The phase diagram spans the optimal doping at
which Tc reaches its maximum value Tc^max. c, Tc^max obtained from different
monolayer Bi-2212 samples (an example is shown in b), in comparison with Tc in
optimally doped bulk crystals. The highest Tc^max represents the maximum Tc of
the most intrinsic monolayer in our experiment, and its value lies within the
uncertainty range of the Tc in optimally doped bulk.


Tunable high-temperature superconductivity 


The reduction in dimensionality produces a key advantage: the HTS in 
monolayer Bi-2212 becomes extremely tunable. The tunability stems 
from the fact that both sides of the monolayer are exposed, making it 
easy for interstitial oxygen to escape from or enter the crystal. Specifi- 
cally, we find that mild vacuum annealing at temperatures between 
300 K and 380 K drives oxygen out of the monolayer. Meanwhile, anneal- 
ing at about 200 K in ozone (partial pressure approximately 0.5 mbar) 
increases the oxygen concentration (Extended Data Fig. 2). These 
findings enable us to continuously vary the doping level and track the 
evolution of various phases, including superconductivity, from an 
over-doped to deeply under-doped regime (and vice versa) in a single
monolayer sample. Figure 2a displays a set of measurements of
temperature-dependent resistivity, R□(T), of a monolayer Bi-2212
(sample A), acquired between annealing treatments at 300–380 K in
vacuum (base pressure <10* mbar). The annealing treatments progressively
lower the hole doping level in the monolayer and induce a transition
from superconducting to insulating behaviour. Meanwhile, the
room-temperature resistivity increases by one order of magnitude
from about 1 kΩ to about 30 kΩ. Details of the transition become more
apparent when the resistivity of the same sample (normalized to its 
value at T= 200 K) is plotted as a function of temperature and hole 
doping level p, as shown in Fig. 2b. (Here the hole doping level is determined
from p = const./R□(T = 200 K); the value of the constant (const.)
is chosen so that p = 0.16 at optimal doping*®”’, and the precise value
of p does not affect our conclusions.) As p decreases, Tc (defined as the
temperature at which d²R□/dT² = 0; see Extended Data Fig. 3) rises at
first, then falls continuously, giving rise to a superconducting dome
that ends at p = 0.022. An insulating phase appears next to the
superconducting dome. In addition, we observe at T* > Tc the onset of
the pseudogap phase that is marked by deviation from a linear
R□(T) in the normal state of a high-temperature superconductor under
various doping levels (open black circles in Fig. 2b; see Extended Data
Fig. 3 for detailed analysis). Figure 2b, therefore, maps out a phase 
Fig. 3 | Tunnelling spectroscopy of monolayer Bi-2212. a, Schematic
illustration of the STM measurement set-up. Monolayer Bi-2212 (and the bulk
crystal nearby) is electrically connected to a pre-patterned Au electrode,
which provides a return path for the tunnelling current. b, High-resolution
STM topograph of a monolayer Bi-2212. The image was taken at a junction
resistance of 2 GΩ and a sample bias voltage of −300 mV. Inset: magnified view
of top Bi atoms and supermodulation ridges. c, Fourier transform of the STM
topograph in b. Peaks are clearly visible at multiples of the supermodulation
wavevector q_SM. Bragg peaks of the atomic lattice are marked by broken
circles. d, Line cut of the Fourier transform in c along the [1,1] direction.
Monolayer curve (1L, blue) is compared with bilayer (2L, red) and bulk (black)
data; peaks at q_SM and 2q_SM align within an uncertainty of 1.5% of 2π/a0.


diagram of the monolayer that is strikingly similar to that of bulk cop- 
per oxides®. 
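The doping assignment and the Tc criterion used for the phase diagram in Fig. 2b can be illustrated with a short numerical sketch. It assumes the relation p = const./R□(T = 200 K) with the constant of 217 Ω quoted in the Fig. 2 caption for sample A, and it locates the zero of d²R□/dT² as the point of steepest slope in the resistive transition. The function names and the synthetic tanh-shaped transition are hypothetical and serve only as an illustration; this is not the authors' analysis code.

```python
import numpy as np


def doping_level(r_sheet_200k_ohm, const_ohm=217.0):
    """Hole doping p = const / R(T = 200 K); 217 ohm is the sample A value."""
    return const_ohm / r_sheet_200k_ohm


def tc_from_inflection(temps_k, r_sheet_ohm):
    """Temperature of the steepest rise of R(T).

    For a smooth resistive step, the maximum of dR/dT is where d2R/dT2
    crosses zero, which is the Tc criterion stated in the text.
    """
    temps_k = np.asarray(temps_k, dtype=float)
    r_sheet_ohm = np.asarray(r_sheet_ohm, dtype=float)
    slope = np.gradient(r_sheet_ohm, temps_k)  # numerical dR/dT
    return temps_k[np.argmax(slope)]


# Synthetic transition centred at 85 K (illustrative numbers only).
temps = np.linspace(50.0, 120.0, 701)
r_sheet = 1000.0 * 0.5 * (1.0 + np.tanh((temps - 85.0) / 1.5))

tc = tc_from_inflection(temps, r_sheet)
p = doping_level(1356.25)  # a sheet resistance that maps to p = 0.16
```

On real data the curve would normally be smoothed before differentiation, because numerical derivatives amplify noise.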

Close examination of the phase diagram in Fig. 2b provides further 
insights into the 2D HTS in monolayer Bi-2212. We focus on the high Tc
that characterizes the superconducting transition in the monolayer.
Specifically, we use the phase diagram to accurately determine how
much, if at all, Tc is suppressed in the monolayer compared with that in the
bulk. Because Tc strongly depends on hole doping level, a comparison
is valid only when it is made at the same doping level. The maximum Tc
at optimal doping, Tc^max, therefore serves as a natural metric for such
comparison, given that varying the sample thickness does not alter the
optimal doping level itself. Figure 2c summarizes the measured Tc^max of
monolayer Bi-2212 in comparison with the Tc of optimally doped bulk
e, Spatially averaged differential conductance spectra acquired on monolayer,
bilayer and bulk Bi-2212 at near optimal doping. Broken lines mark the position
of coherence peaks. The horizontal bars mark the zero of each curve.
f, Temperature dependence of spatially averaged spectra on a nearly
optimally doped monolayer Bi-2212, showing a smooth transition from the
superconducting to pseudogap state at T = 85 K. g, Evolution of spatially
averaged tunnelling spectra of monolayer Bi-2212 with diminishing doping
level p. Here the doping level is characterized by Δ1(p). The energies Δ1 (black
arrows) are extracted at the pseudogap edge, and the energies Δ0 (vertical
bars) are identified as the ‘kink’ energy”’. Curves are offset for clarity, and
horizontal bars mark the zero of each curve.


crystals. (Here Tc^max of monolayers was extracted from phase diagrams,
exemplified in Fig. 2b, and we ensured that the superconducting domes
of all monolayer samples spanned the optimal doping so that Tc^max
could be reliably determined; Tc^max determined by different methods
is shown in Extended Data Fig. 3.) Both datasets exhibit appreciable
spread that most likely reflects variations in the impurity level in different
specimens. More importantly, the highest Tc^max of 88.1 K, which
represents the most intrinsic monolayer, is within the uncertainty of
optimal bulk Tc. The difference of about 2% between the average of Tc^max
in the monolayer and the average of optimal Tc in bulk may be explained
by inevitable slight sample degradation from our fabrication process.
Our observations therefore reveal a robust 2D HTS in monolayer Bi-2212 
with optimal transition temperature as high as that in 3D bulk. 
Fig. 4 | Quasi-particle interference and superconducting gap in monolayer
Bi-2212. a, Representative conductance ratio map Z(r, E) obtained at E = 20 meV
on the same area as in Fig. 3b. b, Illustration of the octet model for Bogoliubov
quasiparticle interference in Bi-2212 at a given energy. The octet ends of four
banana-shaped constant-energy contours have maximum density of states.
Quasi-particle scattering between these eight regions produces seven primary
scattering q-vectors, q1 to q7, labelled by coloured squares. c-h, Fourier
transform of the conductance ratio map |Z(q, E)|. The Fourier transforms are
mirror-symmetrized and normalized to their average value. E is labelled on
each panel. In particular, f displays the Fourier transform of the conductance
ratio map in a. Red solid lines indicate the atomic Bragg vectors at (2π/a0, 0)
and (0, 2π/a0). Of the total of seven independent scattering vectors (coloured


Monolayer topography and tunnelling spectroscopy 


STM topography measurement (schematic set-up shown in Fig. 3a) 
confirms the high quality of monolayer Bi-2212, which retains the 
original atomic structure found in the bulk crystals. Figure 3b displays
the atom-resolved topography of the top BiO plane of a Bi-2212 monolayer.
The surfaces are as clean as the bulk surface and are continuous
over macroscopic distances (about 100 μm; Extended Data Fig. 6).
Nearly commensurate supermodulation ridges along the [110] direction,
a distinctive feature in Bi-based bulk copper oxides'*, are clearly
observed. Fourier transform of the topography images reveals that
the period of the supermodulation q_SM exactly matches that on the
bulk surface (Fig. 3c, d); no additional surface reconstructions were
detected. Despite the identical atomic structure, monolayer Bi-2212
does exhibit a feature not seen on the bulk surface: large-scale corrugations
with a root-mean-square (r.m.s.) value of 0.2 nm, in contrast to
the flat surface of the bulk crystal. We attribute the corrugations to the 
underlying substrate: few-layer Bi-2212 may become flexible and 
squares) prescribed by the octet model illustrated in b, five are observed as
peaks in the Fourier transform; q, and q; are too weak to be detected. i, Loci of
the ends of banana-shaped constant-energy contours extracted from dispersion
of the q-vectors. Locations of the loci represent the underlying Fermi surface.
Solid line is a fit to the data with a circular arc joined with two straight lines.
Broken line marks the antiferromagnetic zone boundary. j, Superconducting
gap Δ_QPI as a function of Fermi surface angle θk. Δ_QPI is extracted from the
measured position of scattering vectors q1 to q7 (excluding q, and q;) following
the procedure described in refs. **”°. Solid line is a fit to the data with d-wave
gap function Δ(θk) = Δ_QPI[A cos(2θk) + (1 − A) cos(6θk)], where Δ_QPI = 47.3 meV and
A = 0.844 are fitting parameters.


partially conform to the rough surface of amorphous SiO2 (r.m.s.
approximately 0.25 nm).

We now turn to the electronic structure of monolayer Bi-2212. We 
note that a variety of spectroscopy studies revealed a rich set of phases 
that are characterized by two energy scales, referred to as Δ0 and Δ1, in
bulk Bi-2212 (refs. 7°”). Specifically, excitations in the superconducting
state occur at energies E < Δ0, whereas charge-order and other highly
correlated broken-symmetry states appear at pseudogap energy scale
E = Δ1; the competition or cooperation between these intertwined
phases remains one of the central problems of HTS (refs. *). In the
following, we examine these strongly correlated states in monolayer 
Bi-2212. 

Figure 3e displays the differential conductance spectra g(E), which
are proportional to the DOS at energy E, of monolayer and bilayer samples
cleaved from a nearly optimally doped bulk crystal with Tc = 88 K
(referred to as OP88). Here the spectra are spatial averages of the local
differential conductance spectra g(r, E = eV) = dI/dV|_(r,V) over a 500 Å
Fig. 5 | Electronic inhomogeneity and charge-ordered state in monolayer
Bi-2212. a-c, Gap maps Δ1(r) obtained on monolayer Bi-2212. The monolayers
are obtained from bulk crystals UD50 (under-doped, Tc = 50 K), OP88 (optimally
doped, Tc = 88 K) and OD55 (over-doped, Tc = 55 K). Field of view, 400 Å × 400 Å.
Δ̄1 denotes the average value of Δ1 over the entire field of view. Δ1(r) in a was
determined from fitting each local tunnelling spectrum using the method
described in ref. *°. Values of Δ1(r) in b and c were extracted as the energy
separation between two coherence peaks in each local tunnelling spectrum.
d, Histograms of Δ1(r) shown in a-c normalized by their mean value. The
normalized gap distributions in monolayers are highly similar to those of bulk
source crystals (Extended Data Fig. 9). e-g, Conductance maps
g(r, E) = dI/dV(r, E) recorded at E = 20 meV on the same areas shown in a-c.
h-j, Fourier transforms of g(r, E = 20 meV) in e-g. Charge-order peaks are clearly
resolved at q = (±0.25, 0)2π/a0 and (0, ±0.25)2π/a0 (marked by broken circles)
in under-doped monolayer. Red crosses mark lattice wavevectors at (±2π/a0, 0)
and (0, ±2π/a0).


× 500 Å field of view; I and V are tunnelling current and sample-bias
voltage, respectively, and e is the charge of an electron. The V-shaped
superconducting energy gap and the large coherence peaks on both
sides of the gap are clearly observed in the spectra. The size of the gap,
Δ0, defined as half the separation between two coherence peaks, in the
monolayer and bilayer is almost identical to that in the bulk (Fig. 3e,
black curve) from which the monolayer and bilayer were obtained. Close
examination reveals that the monolayer and bilayer spectra also faithfully
reproduce the fine details, the dip-hump structure outside of the
gap and the electron-hole asymmetric background in particular, that
are found in the bulk spectrum”. Differential conductance spectra at
elevated temperatures show that the pseudogap state, too, persists in
monolayer Bi-2212. The pseudogap state manifests as a gap in g(E) well
above the Tc of the bulk source crystal (Fig. 3f). Finally, we note that Δ1
coincides with Δ0 in the nearly optimally doped monolayer. On lowering
the doping level, however, the two energy scales diverge: Δ1 moves to
higher energies, whereas Δ0 becomes smaller (Fig. 3g), consistent with
the behaviour in bulk copper oxide superconductors'**. The close
match between the monolayer and bulk spectra is the first indication 
that the superconducting state (and electronic structures associated 
with it) remains intact in the 2D limit. 


Quasi-particle interference and superconducting gap 


The low-energy excitations inside the superconducting energy gap 
carry crucial information on the superconducting state. The excitations,
also known as Bogoliubov quasiparticles, scatter off impurities and
produce interference patterns that can be detected by spatial mapping
of the tunnelling conductance g(r, eV) at a given bias V on the bulk
Bi-2212 surface”. Further, the Fourier transform of the interference
patterns reveals maxima at a set of energy-dependent wavevectors
q_i (i = 1, ..., 7), a result of elastic scattering between the eight high
joint-density-of-state loci of the ‘banana-shaped’ constant energy contour
of Bogoliubov quasiparticles” (referred to as the ‘octet model’; Fig. 4b).
The quasi-particle interference has therefore been a powerful tool for
reconstructing the superconducting gap dispersion Δ(k) of copper
oxide superconductors'*”*.
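As a rough illustration of how a periodic modulation in a real-space conductance map appears as a peak in its Fourier transform, the sketch below builds a synthetic map with a single cosine modulation and recovers its wavevector. The function name, the grid size and the synthetic data are assumptions made for this example; real quasi-particle interference analysis involves several dispersing peaks and further processing that is not shown here.

```python
import numpy as np


def dominant_wavevector(image, pixel_size):
    """Return (|qx|, |qy|) of the strongest non-uniform Fourier component.

    Wavevectors are in radians per unit of pixel_size. For a real image the
    power spectrum is symmetric under q -> -q, so only magnitudes are returned.
    """
    image = np.asarray(image, dtype=float)
    centred = image - image.mean()          # remove the q = 0 component
    power = np.abs(np.fft.fft2(centred)) ** 2
    n_rows, n_cols = centred.shape
    qx = 2.0 * np.pi * np.fft.fftfreq(n_cols, d=pixel_size)
    qy = 2.0 * np.pi * np.fft.fftfreq(n_rows, d=pixel_size)
    row, col = np.unravel_index(np.argmax(power), power.shape)
    return abs(qx[col]), abs(qy[row])


# Synthetic 64 x 64 map with eight modulation periods across the x direction.
x = np.arange(64)
synthetic_map = np.tile(np.cos(2.0 * np.pi * 8 * x / 64), (64, 1))
q_x, q_y = dominant_wavevector(synthetic_map, pixel_size=1.0)
```

Choosing a modulation that fits an integer number of periods into the window keeps the peak on a single Fourier pixel; otherwise spectral leakage spreads it over neighbouring pixels.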

We used the quasi-particle interference technique to probe Δ(k) in
monolayer Bi-2212. We focus on the conductance ratio map
Z(r, E = eV) = g(r, +eV)/g(r, −eV), which eliminates systematic errors
related to the tunnelling setpoint associated with directly mapping the
conductance g(r, eV) (ref. 7°). Figure 4a displays an example of the
conductance ratio map of monolayer Bi-2212 obtained at E = 20 meV.
The Fourier transform of the conductance ratio map, |Z(q, E = eV)|, shows
clear maxima at q_i that are fully consistent with the octet model, except
that peaks at q, and q; are too weak to be detected (Fig. 4f). As the tunnelling
bias V is varied, we observe that the measured q_i disperse with
energy E = eV, and the dispersions q_i(E) are again consistent with those
expected from the octet model (Extended Data Figs. 7 and 8). The dispersions
q_i(E) allow us to extract the energy-dependent locations of the
octet ends of the ‘bananas’ in k space, and the obtained loci can be interpreted
as the normal-state Fermi surface”. Our result, shown in Fig. 4i,
is consistent with a cylindrical Fermi surface centred at (π, π) that is
observed in bulk Bi-2212 and various other bulk copper oxide
superconductors”*. Finally, we determine the superconducting gap
dispersion Δ(k) from q_i(E). Figure 4j displays the measured superconducting
gap energy of the monolayer as a function of θk along the Fermi
surface. The data agree with the d-wave superconducting gap dispersion
of bulk Bi-2212 at similar doping level”?. We therefore conclude
that reducing the material’s dimensions from three to two does not 
fundamentally alter the superconducting gap structure. 
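The d-wave gap function used for the fit in Fig. 4j can be evaluated directly. The sketch below is a minimal illustration that takes the fitted amplitude of 47.3 meV and the weight A = 0.844 from the figure caption; the function name and the choice of degrees for the angle are assumptions. By construction the expression is largest at θk = 0 and vanishes at θk = 45°, the nodal direction.

```python
import numpy as np


def d_wave_gap(theta_k_deg, gap_amplitude_mev=47.3, a=0.844):
    """Gap(theta) = amplitude * [A cos(2 theta) + (1 - A) cos(6 theta)], in meV.

    The default amplitude and A are the fit values quoted in the Fig. 4j
    caption; theta_k_deg is the Fermi surface angle in degrees.
    """
    theta = np.deg2rad(theta_k_deg)
    return gap_amplitude_mev * (a * np.cos(2.0 * theta)
                                + (1.0 - a) * np.cos(6.0 * theta))


angles_deg = np.linspace(0.0, 45.0, 10)
gaps_mev = d_wave_gap(angles_deg)
```

The cos(6θk) term is a higher harmonic with the same d-wave symmetry. With A below 1 it lowers the gap at intermediate angles relative to a pure cos(2θk) form, while the values at 0° and 45° stay fixed.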


Electronic inhomogeneity and charge-ordered state 


Next, we focus on the electronic structure of monolayer Bi-2212 beyond
the superconducting energy gap Δ0. In particular, the energy scale Δ1
is associated with the anti-nodal pseudogap and other correlated states
that are intricately linked to superconductivity”. In contrast to the
relatively homogeneous superconducting gap Δ0, the pseudogap Δ1
varies widely at the nanometre length scale on bulk copper oxides’’.
To study the inhomogeneity in monolayer Bi-2212, we extract Δ1 from
the local differential conductance spectra collected on a dense array
of locations on samples at various doping levels, and construct the gap
map Δ1(r) as shown in Fig. 5a–c. Similar to previous measurements on
bulk Bi-2212, we find that wide, nanometre-scale variations in Δ1 diminish
as the doping level increases in the monolayer; meanwhile Δ1
averaged over the entire field of view, Δ̄1, shifts to lower energies.
Close examination of Δ1 histograms reveals that Δ1 in monolayers is in
general larger than that in bulk source crystals from which the monolay- 
ers are cleaved (Extended Data Fig. 9), and the deviation varies from 
Fig. 6 | Electronic structure of monolayer Bi-2212 in the Mott insulating
regime. a, Spatially averaged differential conductance spectra of monolayer
Bi-2212 obtained between vacuum annealing cycles. The annealing
temperature is marked on each curve. The spectrum labelled ‘Initial’ was
recorded before annealing. The as-exfoliated monolayer (obtained from OD55
crystal) was initially over-doped. The annealing cycles progressively lower its


sample to sample. Such deviation is consistent with results from transport
measurements; we attribute it to slight loss of oxygen doping (up
to 3% in over-doped samples) during sample fabrication. The gap distributions
in monolayer and bulk, however, converge if Δ1 is normalized
to Δ̄1 in each gap map (Fig. 5d). This observation suggests that the
microscopic mechanism of the Δ1 disorder remains the same in the
monolayer, even though the monolayer’s dielectric environment is, in
the absence of the interlayer Coulomb interaction, very different from
that of the bulk.

Despite the large spatial inhomogeneity at high energy scale, a periodic
chequerboard charge order (CO) emerges outside of the superconducting
energy gap in various bulk copper oxides””*””. Recent experiments
show mounting evidence that a periodic modulation of Cooper pairing,
that is, a pair density wave, may coexist with the charge order?*!.
These charge-ordered states are intimately related to the superconductivity
in the CuO2 plane””*. An important question is then whether these
states persist in the 2D limit. Our conductance mapping of an under-doped
monolayer answers the question in the affirmative. As shown in
Fig. 5e, a chequerboard pattern is resolved on the conductance map
g(r, E) obtained at E = 20 meV. Fourier transform of the map (Fig. 5h)
shows that the chequerboard pattern corresponds to wavevector Q_CO
around 1/4 of the lattice wavevector 2π/a0 along the Cu-Cu bond direction
(a0 is the distance between neighbouring Cu atoms). The CO therefore
has a real-space wavelength of about 4a0, with a correlation length
of about 14a0 obtained from a Gaussian fit to its peak profile (Extended
Data Fig. 10). These results agree well with bulk values”®”"**. As the
doping level increases, the CO diminishes and eventually disappears 
in the over-doped regime (Fig. 5i, j), consistent with observations in 
bulk copper oxides”. Finally, we present evidence that pair density 
doping level and eventually make the specimen extremely under-doped. b,
Representative tunnelling spectra of the extremely under-doped monolayer in 
a. Inset: tunnelling conductance maps recorded at tunnelling biases of 0.2 V 
(upper panel) and 1.6 V (lower panel). Crosses mark the positions where the 
spectra are taken. Spectra are shifted vertically for clarity. 


waves also exist in monolayer Bi-2212. Here we examine spatial variation 
in the amplitude of the coherence peak in the tunnelling spectrum, 
which empirically correlates with Cooper-pair density modulation in 
bulk Bi-2212, using the procedure described in ref. *. The coherence 
peak amplitude map (of the same area as in Fig. 5a, e; Extended Data 
Fig. 11) exhibits a chequerboard pattern with a period of about 4a0, a
clear signature of a pair density wave order.
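The real-space period of about four lattice constants quoted for both the charge order and the coherence-peak modulation follows directly from a wavevector near one quarter of the lattice wavevector. As a worked relation, with λ_CO introduced here only as a label for the modulation wavelength:

```latex
\lambda_{\mathrm{CO}} \;=\; \frac{2\pi}{Q_{\mathrm{CO}}}
\;\approx\; \frac{2\pi}{\tfrac{1}{4}\cdot\dfrac{2\pi}{a_0}}
\;=\; 4a_0 .
```

A correlation length of about 14a0 then corresponds to roughly three to four modulation periods.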


Electronic structure in the Mott insulating regime 


Because oxygen in monolayer Bi-2212 escapes easily at elevated tem- 
peratures, we are able to access a wide, continuous doping range in 
a single specimen by gentle annealing in ultra-high vacuum (Fig. 6a 
and Extended Data Table 2). Here we focus on the extremely under- 
doped regime, where the pseudogap and charge-ordered states start 
to emerge from the parent Mott insulator’°**. Figure 6b displays typical 
tunnelling spectra obtained on an extremely under-doped monolayer
(see inset). The evolution of the spectra is strikingly similar to that in 
severely under-doped bulk copper oxides**™. A large charge transfer 
gap of 1.2 eV is observed on Mott insulating patches. (The gap value is 
20% larger than that in bulk Bi-2212 (ref. *"); we attribute the discrepancy 
to the tip-induced band-bending effect that is common in tunnelling 
spectroscopy studies of insulators® and 2D materials*.) Outside the 
Mott insulating patches, a broad in-gap state develops within the charge 
transfer gap, giving rise to pseudogap-like spectra around the Fermi
level. As in the bulk, the conductance maps at low bias and high bias 
are anticorrelated (Fig. 6b inset), which implies that the in-gap state 
comes from spectral weight transfer from the upper Hubbard 
band of the parent Mott insulator. Our results on monolayers, therefore, 


indicate that the dimensionality effect, if it exists at all, does not
play an important role in the transition from Mott to pseudogap phase
in Bi-2212.
Methods


Fabricating monolayer Bi-2212 for transport measurements 
Monolayer flakes of Bi-2212 can be obtained through mechanical exfoliation*.
However, the size of thin flakes tends to be small because
brittle Bi-2212 crystals break easily during exfoliation. Activating the
SiO2 surface with oxygen plasma treatment greatly increases the area
and yield of monolayer crystals on the SiO2/Si wafer”. We attribute the
improvement to the enhanced adhesion between Bi-2212 and SiO2: the
plasma treatment functionalizes the SiO2 surface with hydroxyl groups
that strongly bind to Bi-2212 (ref. *).

Our systematic investigation (Extended Data Table 1 and Extended 
Data Fig. 1) reveals that exposing the monolayers to air, albeit briefly, 
renders them insulating*’*****>. An inert Ar atmosphere preserves the 
superconductivity in the monolayers, but the protection is incomplete: 
T, is much suppressed after a prolonged fabrication process at room 
temperature, and shortening the fabrication time leads to a higher 
T.. These observations point to (1) reaction with water vapour in air 
and (2) rapid oxygen loss at room temperature as two main causes of 
degradation in monolayer Bi-2212. (The same degradation pathways 
are also present in bulk crystals** °°.) The oxygen loss, however, slows 
down considerably at moderately low temperatures, so we are able to 
obtain high-quality, intrinsic monolayers by fabricating samples on a
cold stage kept at −40 °C inside an Ar-filled glove box with water and
oxygen content below 0.1 ppm. 

To avoid heating during electrode deposition, we make electrical 
contacts to the exfoliated monolayers by cold-welding indium/gold 
microelectrodes on the cold stage. Once the device fabrication is complete,
we seal each device in a chip carrier (we use ceramic dual-in-line
chip carriers) with vacuum grease and a cover glass inside the glove box,
and then transfer the whole package into the cryostat. The cover glass 
comes off once the sample space of the cryostat is evacuated before 
low-temperature transport measurements, so the doping level can be 
tuned in situ. 


Fabricating monolayer Bi-2212 for STM/STS measurements 

We used a vacuum-compatible tape (Kapton tape with silicone adhesive; 
Accu-Glass Products) to exfoliate thin flakes of Bi-2212 onto Si wafer 
(covered with a 285-nm-thick SiO2 layer) in a vacuum chamber with a
base pressure of 1 x10" mbar. Few-layer Bi-2212 on the substrate exhibits
quantized contrast that correlates well with the number of layers
(see also Fig. 1). The correlation makes the search (also done in UHV with
12× Ultrazoom (Navitar) through a re-entrant viewport) for monolayers
convenient. Some of the flakes touch electrodes (Cr/Au with thickness
of 2 nm and 3 nm, respectively) in the form of stripes that are pre-patterned
on the wafer before the exfoliation; we choose these flakes for
STM measurements (Fig. 3a). Except for brief moments when the samples
were being transferred to the STM stage, the temperature was always
kept below −120 °C. Finally, we confirm the thickness of the samples
with AFM outside the UHV after all measurements are completed, to 
ensure that they were indeed monolayers (Extended Data Fig. 6). 


Finite-size scaling analysis of the superconductor-to-insulator 
transition in monolayer Bi-2212 
At the superconductor-to-insulator transition (SIT) in monolayer Bi-2212,
HTS emerges from the parent Mott insulator as the sample is doped
beyond a critical level. Such a transition is an important example of a
continuous quantum phase transition (QPT) that is driven by an external
parameter x at absolute zero temperature”; the quantum critical
point at x_c separates ground states with different symmetry. Exactly
how Cooper pairs form in the 2D copper oxide plane and condense into
the superconducting phase is a key outstanding question. However,
crucial information on the transition can be obtained by investigating
the scaling behaviour of R□(x, T) as x approaches x_c at finite T. This is
accomplished by finite-size scaling analysis under the general scheme 


of QPT®. Near the quantum critical point, the correlation length ξ
and correlation time τ become the only characteristic scales in length
and time, respectively, and they diverge as ξ ∝ |x − x_c|^(−ν) and
τ ∝ ξ^z ∝ |x − x_c|^(−νz); ν and z are critical exponents. The theory of
finite-size scaling asserts that physical quantities have a scaling form that,
together with exponents ν and z, depends only on global properties of the
system, but not on microscopic details. For 2D SIT, the appropriate
finite-size scaling form is®:

$$R_\square(x, T) = R_c\, f\left(|x - x_c|\, T^{-1/\nu z}\right) \qquad (1)$$



Here the transition is driven by doping variation, so x = p; R_c is the critical
resistivity in the p → p_c and T → 0 limit, and f is a universal scaling function.
Such scaling does not depend on the exact value of p, but the
exponent νz and the critical resistivity R_c encode the fundamental properties
of the transition. In particular, νz is determined by the universality
class that the system belongs to; its value thus provides valuable
information such as the symmetry of order parameter manifold and
types of disorder in 2D Bi-2212 (ref. *8).

Extended Data Fig. 4a-c illustrates the finite-size scaling analysis of
the SIT in sample A. Following the procedure described in ref. °, we first
invert the R□(p, T) data matrix in Extended Data Fig. 4a, and locate the
critical point p_c, where all isotherms converge to R□ = R_c = 10.2 kΩ
(Extended Data Fig. 4b). We then scale the horizontal axis of Extended
Data Fig. 4b as u = |p − p_c|t(T) in Extended Data Fig. 4c. Here a single set
of temperature-dependent parameters t(T) can force all curves to collapse
onto a universal scaling function. Further analysis shows that t(T)
follows a power-law dependence, t(T) ∝ T^(−1/νz) (Extended Data Fig. 5a,
blue circles). The SIT in monolayer Bi-2212 is, therefore, well described
by a continuous 2D QPT, with νz = 1.53 matching the critical exponents of
the SITs driven by ionic gating in thin films of La2−xSrxCuO4 (LSCO, ref. **),
lithium-intercalated Bi2Sr2CaCu2O8+x (LixBi-2212, ref. °°), and La2CuO4+δ
(LCO, ref. ©). The close match indicates that the SITs in these
copper oxides all belong to the same universality class, even though the
critical resistivities differ among these systems.
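The link between scaling form (1) and the exponent νz can be made concrete with a toy calculation. The sketch below generates synthetic isotherms that obey the scaling form exactly and then recovers νz from the temperature dependence of the slope dR/dp at the critical point, which scales as T^(−1/νz). This slope-based estimate is a simpler stand-in for the curve-collapse procedure described above, not a reproduction of it. The scaling function exp(−50u), the use of a signed scaling variable, the temperature grid and all function names are assumptions chosen for illustration; only p_c = 0.022, R_c = 10.2 kΩ and νz = 1.53 are taken from the text.

```python
import numpy as np


def toy_resistance(p, temp_k, p_c, r_c, nu_z, steepness=50.0):
    """Toy isotherm R = R_c * f(u) with u = (p - p_c) * T**(-1/nu_z).

    f(u) = exp(-steepness * u) is an arbitrary smooth choice with f(0) = 1,
    so every isotherm passes through R_c at p = p_c.
    """
    u = (p - p_c) * temp_k ** (-1.0 / nu_z)
    return r_c * np.exp(-steepness * u)


def estimate_nu_z(p, temps_k, isotherms, p_c):
    """Estimate nu*z from |dR/dp| at p_c, which is proportional to T**(-1/nu_z)."""
    slopes_at_pc = []
    for row in isotherms:                     # one row per temperature
        dr_dp = np.gradient(row, p)           # numerical derivative along p
        slopes_at_pc.append(abs(np.interp(p_c, p, dr_dp)))
    fit_slope, _ = np.polyfit(np.log(temps_k), np.log(slopes_at_pc), 1)
    return -1.0 / fit_slope


p_grid = np.linspace(0.012, 0.032, 41)
temps = np.array([2.0, 4.0, 8.0, 16.0])
data = np.array([toy_resistance(p_grid, t, 0.022, 10.2e3, 1.53) for t in temps])
nu_z_estimate = estimate_nu_z(p_grid, temps, data, p_c=0.022)
```

With measured data the crossing point p_c would first have to be located from the isotherms themselves, and noise in the numerical derivative would limit the accuracy of the fitted exponent.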

A survey of critical exponents in copper oxide superconductors, however,
shows that not all νz agree with the value in monolayer Bi-2212;
various νz values were found to cluster around two different values: 3/2
and 7/3 (refs. *8"-**; Extended Data Fig. 4b, blue squares). It therefore
appears that the transitions fall into two distinct universality classes,
even though in all copper oxide superconductors the superconductivity
arises from doping Mott insulating CuO2 planes. These observations
raise two fundamental questions: (1) what specifically causes the disparate
critical exponents in copper oxide superconductors? and (2)
what universality classes do they correspond to? We now address these
questions by investigating the SIT in monolayer Bi-2212 along another
dimension in the parameter space: the disorder level. Here we tune the
disorder level by introducing a small amount of air (that contains water
vapour, the main degradation agent) into the sample chamber while
annealing monolayer Bi-2212 at elevated temperatures.

Extended Data Fig. 4g displays the temperature-dependent resistivity,
R□(T), of a monolayer Bi-2212 (sample C). The sample undergoes a
sequence of annealing cycles in 10 mbar of air (containing about 0.3 mbar
of water vapour) at room temperature. The curves were obtained
between annealing cycles. We observe that the resistivity drops to
zero in two steps as the temperature is lowered. The higher-temperature
drop occurs at the apparent Tc of the monolayer, but the resistivity drops
to zero only after a second transition at a lower temperature (Extended
Data Fig. 4g). Such a two-step transition resembles the superconducting
transition in 2D Josephson-coupled superconducting arrays® and is
ubiquitous in disordered 2D superconducting systems in general”. A
simplified picture captures the basic physics of the two-step transition: 
the disordered 2D superconductor can be modelled as superconduct- 
ing islands embedded in normal metal that provide weak Josephson 


coupling between the islands. The higher-temperature transition cor- 
responds to the superconducting transition within the islands, but the 
entire sample becomes superconducting only when the global, inter- 
island phase coherence is established after a second transition at a 
lower temperature®. 

The SIT takes place at the lower-temperature transition in this disordered
monolayer Bi-2212. Because the apparent Tc does not change
appreciably during the SIT (Extended Data Fig. 4g), the transition
is now predominantly driven by disorder that mainly affects the
metallic region between the islands. Finally, we perform finite-size
scaling analysis of the disorder-driven SIT in monolayer Bi-2212. We
parameterize the phenomenological disorder level as
d = const./R(T = 200 K). (The value of the constant does not affect
our analysis; we chose const. = 213 Ω.) Using the scaling form (1) with
x = d, we obtained a critical exponent of νz = 2.35, which is close to 7/3.
The same analysis on a less disordered monolayer yields a similar νz
(Extended Data Fig. 4d and Extended Data Fig. 5a).
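As a rough illustration of how such an exponent can be read off from a scaling collapse, the sketch below assumes the standard finite-size scaling form R(x, T) = Rc f(|x − xc| t(T)) with t(T) ∝ T^(−1/νz). Form (1) itself is not reproduced in this section, so this functional form is an assumption rather than a restatement of the authors' procedure; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def critical_exponent_from_t(temperatures, t_values):
    """Estimate vz from t(T), assuming t(T) is proportional to T**(-1/vz)."""
    log_T = np.log(np.asarray(temperatures, dtype=float))
    log_t = np.log(np.asarray(t_values, dtype=float))
    # On a log-log plot the power law is a straight line with slope -1/vz.
    slope, _intercept = np.polyfit(log_T, log_t, 1)
    return -1.0 / slope

# Synthetic (not measured) scaling parameters generated with vz = 7/3.
T = np.linspace(6.0, 24.0, 10)
t = 3.0 * T ** (-3.0 / 7.0)
vz = critical_exponent_from_t(T, t)
print(round(vz, 3))
```

With real data, t(T) would first have to be obtained by collapsing the resistivity curves on both sides of the transition; only the final line fit is sketched here.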

We can now explain the two disparate critical exponents observed in
copper oxide superconductors. We first note that the two distinct critical
exponents in monolayer Bi-2212 confirm early observations that SITs in
copper oxide superconductors fall into two universality classes. The
mystery is, however, resolved: our results show that the two universality
classes characterize the doping-driven SIT in the clean limit and the
disorder-driven SIT in the dirty limit, respectively. The exponent νz = 7/3
points towards a quantum percolation model that indeed describes a
strongly disordered superconductor®. Meanwhile, νz = 3/2 encodes the
essential physics of an intrinsic copper oxide superconductor in both
bulk and 2D limits. The fact that bulk and monolayer Bi-2212 belong to
the same universality class suggests that the antiferromagnetic order
found in bulk Bi-2212 may persist in the monolayer. The microscopic
origin of νz = 3/2, however, remains an open question that requires
further investigation.


Data availability 


The datasets generated and analysed during the current study are avail- 
able from the corresponding author on reasonable request. 
Extended Data Fig. 1 | Transport properties of typical monolayer Bi-2212
samples fabricated by various methods. a, Temperature-dependent
resistance of monolayer Bi-2212 samples. Here (#1)-(#5) refer to five typical
samples fabricated by different methods indicated in Extended Data Table 1.
b, Resistance of a typical cold-welded Bi-2212 monolayer device measured with
two-terminal (blue) and four-terminal (red) configurations. The four-terminal
configuration is adopted in all our measurements presented in the main text,
because it eliminates spurious signals from electrical contacts. The
two-terminal resistance in the superconducting state gives an estimate of the
contact resistance of the order of 1 Ω.


Extended Data Fig. 2 | Temperature-dependent resistance of a monolayer
Bi-2212 sample annealed in ozone. Annealing cycles were performed under an
O₃ partial pressure of about 50 Pa at temperatures between 220 K and 240 K.
O₃ was purged with helium gas between annealing cycles, and data were
obtained in helium vapour. Each annealing cycle lasts 5-30 min. Monolayer
Bi-2212 was initially at optimal doping (black curve). The annealing cycles
progressively increase the doping level of the sample. The red curve was
obtained after the first annealing, and the blue curve after the second
annealing.
Extended Data Fig. 3 | Extracting Tc and T* from temperature-dependent
resistance of monolayer Bi-2212. a, Illustration of Tc and T* extraction from
temperature-dependent resistance (black curve, which mostly overlaps with
the red curve) and its derivative (blue curve). We used two definitions of Tc in
our analysis: (i) the temperature at which the slope of the resistance versus
temperature curve is maximum”; (ii) Tc0 from fitting with the Aslamasov-Larkin
paraconductivity model”!, $\Delta\sigma = \sigma(T) - \sigma_{\mathrm{normal}}(T) = A\,(T/T_{c0} - 1)^{-1}$.
Near optimal doping, $\sigma_{\mathrm{normal}}(T) = (bT + c)^{-1}$, so Tc0 can be
extracted from fitting with $R(T) = (bT + c)(T - T_{c0})/(T - T_{c0} + a)$ (red curve).
T* is determined as the temperature at which the derivative of the
temperature-dependent resistance deviates from a constant value (broken blue
line; ref. ”). b, c, Tc under definition (i) (b) and definition (ii) (c)
of monolayer and bulk Bi-2212. Bulk data were obtained from optimally doped
crystals (OP88). Under both definitions, the highest Tc of
monolayers is within the statistical uncertainty range of the Tc in optimally
doped bulk crystals.
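The two definitions of Tc in this caption can be illustrated with a short numerical sketch. It assumes the fitting form R(T) = (bT + c)(T − Tc0)/(T − Tc0 + a) quoted above, set to zero below Tc0; the function names and all parameter values are hypothetical and chosen only for illustration, not taken from the measurements.

```python
import numpy as np

def resistance_model(T, b, c, a, Tc0):
    """R(T) = (b*T + c) * (T - Tc0) / (T - Tc0 + a) above Tc0, zero below."""
    T = np.asarray(T, dtype=float)
    u = np.clip(T - Tc0, 0.0, None)  # distance above Tc0, clipped at zero
    return (b * T + c) * u / (u + a)

def tc_from_max_slope(T, R):
    """Definition (i): temperature at which dR/dT is largest."""
    slope = np.gradient(R, T)
    return T[np.argmax(slope)]

# Hypothetical parameters: linear normal-state resistance b*T + c, Tc0 = 85 K.
T = np.arange(50.0, 150.0, 0.1)
R = resistance_model(T, b=1.0, c=100.0, a=2.0, Tc0=85.0)
tc_slope = tc_from_max_slope(T, R)
print(tc_slope)
```

For this idealized curve the maximum-slope temperature sits within one grid step of Tc0; in measured data the two definitions generally differ, which is why the caption reports both.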
Extended Data Fig. 4 | Superconductor-insulator transition in monolayer
Bi-2212. a, Temperature-dependent resistivity R(p, T) of sample A. The doping
level, fixed for each curve, is tuned by repeated annealing cycles under vacuum
(pressure below 10‘* mbar). The initially superconducting sample becomes
insulating via a QPT. Broken line marks the separatrix where the transition occurs.
Blue shaded region indicates the temperature range in which we perform the
finite-size scaling analysis; the slight up-turn in resistivity at lower temperatures
suggests an intermediate phase or an additional QCP between the superconducting
and insulating phases*’. b, Same dataset as in a, plotted inversely, that is, R(p, T)
plotted as a function of doping level at fixed temperatures between 6 K and 24 K.
Each colour refers to a fixed temperature. Continuous curves are interpolations
of data points at different temperatures. The point where all curves cross defines
the critical point of the QPT (Rc = 10.2 ± 0.6 kΩ, pc = 0.022 ± 0.002). c, Scaling of the
same data with respect to the variable u = |p − pc| t(T). A single set of
temperature-dependent parameters t(T) can force all data to collapse to a universal
scaling function on both sides of the SIT. d, Temperature-dependent resistivity of
sample B. Data were obtained between annealing cycles performed under
10‘ mbar of air that contains about 3 x 10 > mbar of water vapour. The annealing
cycles progressively increase the normal-state resistivity, and induce an SIT in the
monolayer. Blue shaded region marks the temperature range in which we
perform the finite-size scaling analysis. e, Same resistivity data as in d, plotted as a
function of x = 194 Ω/R(T = 200 K). Here x is a phenomenological variable
that parametrizes the external factor (doping or disorder level) that drives
the SIT; the precise value of x does not affect the finite-size scaling analysis
according to formula (1). The critical point of the SIT is identified as
(Rc = 8.7 ± 0.6 kΩ, xc = 0.022 ± 0.002). f, Scaling analysis of the dataset in e. The
analysis yields a critical exponent of νz = 2.45. This νz differs from the critical
exponent in the doping-driven SIT in sample A, but coincides with the value in the
disorder-driven SIT in sample C. Similar to sample C, sample B also features a
two-step superconducting transition (marked by black arrow) that indicates a
considerable amount of disorder. We therefore conclude that disorder level
drives the SIT in sample B. g, Temperature-dependent resistivity of sample C.
Curves are obtained between annealing cycles performed under about 10 mbar
of air. Such annealing cycles introduce disorder into the monolayer, and the
superconducting transition occurs in two steps. The disorder-driven SIT takes
place at the lower-temperature transition (blue shaded region). h, Inverse of the
dataset in g. Horizontal axis represents the phenomenological disorder level
that is parametrized as d = 213 Ω/R(T = 200 K). Smooth interpolations of the
data points cross at the critical point (Rc = 2.86 ± 0.17 kΩ, xc = 0.028 ± 0.002).
i, Scaling of the same data as in h with respect to the variable u = |d − dc| t(T). t(T) is
chosen such that all data collapse to a universal scaling function.
Extended Data Fig. 5 | Critical exponents of superconductor-insulator
transitions in copper oxide superconductors. a, Temperature-dependent
parameter t(T) obtained from finite-size scaling analysis in Extended Data Fig. 4.
Values of t(T) from all three monolayer Bi-2212 samples follow power-law
dependence; the slope of the line fits (solid lines) yields the critical exponents
of the SIT νz = 1.53, 2.45 and 2.35 for samples A, B and C, respectively. b, Critical
exponents νz obtained in monolayer Bi-2212 (red circles) and various other
copper oxide superconductors (black squares). All νz fall into the
neighbourhood of one of the two values, 3/2 and 7/3, that characterize the SIT in
the clean and dirty limit, respectively (see text). Solid vertical lines mark the
mean, and broken lines the standard deviation, of the νz values in each category.
Extended Data Fig. 6 | Characterization of monolayer Bi-2212 after STM
measurements. a, Optical image of typical Bi-2212 flakes exfoliated on SiO₂/Si
substrate. The monolayer (light purple region in the centre) is identified from
its optical contrast. b, A magnified view of the area marked by the square in a.
c, AFM topography of the area marked by the square in b. Both the optical image
and the AFM topography were obtained in an Ar atmosphere inside a glove box
after STM measurements performed in UHV. d, Line cut of the AFM topography
(height in nm versus lateral displacement in μm) along the line shown in c. The
step height of about 1.6 nm confirms that the
Bi-2212 flake measured in STM was indeed a monolayer.


Extended Data Fig. 7 | Fourier transform of the conductance ratio map
obtained on monolayer Bi-2212 at various energies. Each panel displays a
Fourier transform of the conductance ratio map Z(r, E) of nearly optimally
doped monolayer Bi-2212 at the energy labelled on the panel. The Z(r, E) maps
are obtained from a set of 200 × 200-pixel conductance maps taken on an area
of 500 Å × 500 Å with an energy resolution of 2 meV. Data were obtained from
the same sample in Fig. 4 (here we show the full dataset).
Extended Data Fig. 8 | Energy dispersion of the q-vectors. Amplitudes of
measured q_i (in units of 2π/a0) are plotted as functions of energy (i = 1...7, except
for two q-vectors that are too weak to be detected). We followed the method described
in ref.”> to obtain q_i. Solid lines are the energy dispersion of the q-vectors expected
in the octet model.
Extended Data Fig. 9 | Histograms of Δ(r) gap maps in monolayer and bulk
Bi-2212. Solid and empty symbols represent data from monolayer and bulk
Bi-2212, respectively. The Δ distributions in monolayers shift towards higher energies
compared with those in bulk crystals. The shift reflects slight loss of oxygen
doping during monolayer sample fabrication. Specifically, the doping level p is
directly related to the average value of the pseudogap. From the average
pseudogap, we estimate that p = 0.06 ± 0.02, 0.16 ± 0.02 and 0.19 ± 0.02 for
monolayers obtained from UD50, OP88 and OD55, respectively?>**”*. These
values are lower than the doping levels extracted in the bulk crystals
(p = 0.08 ± 0.02, 0.17 ± 0.02 and 0.22 ± 0.01 for UD50, OP88 and OD55,
respectively). Here we used the relations 2Δ = 152 meV × (0.27 − p)/0.22 for
0.1 < p < 0.22 and 2Δ = 85 meV × (0.12 − p)/0.02 for 0.06 < p < 0.08 to estimate
the doping level in both bulk crystals and monolayers.
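The two relations quoted in this caption can be inverted to give the doping level from a measured average gap. The sketch below does only that arithmetic; the function name is hypothetical, the range limits are treated as inclusive (an assumption, so that the quoted end values remain usable), and gaps that fall between the two ranges return None because the caption gives no relation there.

```python
def doping_from_gap(delta_mev):
    """Estimate doping level p from the average gap Delta (in meV).

    Inverts the two linear relations quoted in the caption; returns None
    when the gap lies outside both quoted validity ranges.
    """
    two_delta = 2.0 * delta_mev
    # Relation 1: 2*Delta = 152 meV * (0.27 - p) / 0.22, quoted for 0.1 < p < 0.22
    p = 0.27 - 0.22 * two_delta / 152.0
    if 0.1 <= p <= 0.22:
        return p
    # Relation 2: 2*Delta = 85 meV * (0.12 - p) / 0.02, quoted for 0.06 < p < 0.08
    p = 0.12 - 0.02 * two_delta / 85.0
    if 0.06 <= p <= 0.08:
        return p
    return None

print(doping_from_gap(38.0))   # a gap in the range covered by relation 1
print(doping_from_gap(100.0))  # a gap in the range covered by relation 2
print(doping_from_gap(70.0))   # between the two ranges
```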
Extended Data Fig. 10 | Wavevector of the CDW order in monolayer Bi-2212
obtained in UD50. Line cut (blue line) of the FFT of the g(r, E = 20 meV) map in Fig. 5h
along the Cu-O bond direction exhibits a peak at q = 0.25 (2π/a0) that is
associated with the charge-ordered state. The magenta line is a Gaussian fit to
the peak plus a decaying exponential background. The full-width at half-maximum
of the peak yields a correlation length of about 14a0.
Extended Data Fig. 11 | Pair density wave in monolayer Bi-2212. a, Four
representative conductance spectra (dI/dV; upper panel) and the negative of
their second derivative (D = −d²(dI/dV)/dV²; lower panel) in under-doped monolayer
Bi-2212 obtained from UD50. We additionally define H = dI/dV(E = Δ) − dI/dV(E = 0),
which corresponds to the amount of low-energy DOS gapped out by Cooper
pairing (here Δ = 15 meV). The pair density wave can be visualized by spatially
mapping either H or D (ref.*”). b, H(r) map on a 40 nm × 40 nm area. A
chequerboard pattern is clearly resolved. c, Fourier transform of the H(r) map
in b. Peaks at |q| = (0.25 ± 0.02) 2π/a0 (marked by broken circles) along the Cu-O
bond directions indicate the emergence of pair density wave order”. d-h, D(r)
maps obtained on the same area as in b at various energies. i-m, Fourier transform
of the D(r) maps in d-h. The |q| = 2π/4a0 spatial modulations at E = 15 meV
(broken circles in j) again indicate the existence of pair density wave”. Red
crosses mark q = (0, ±π/a0) and (±π/a0, 0). We followed the method described in
ref. to obtain H(r) and D(r) maps. First, a set of conductance (dI/dV) spectra was
taken on a 160 × 160 grid over the 40 nm × 40 nm area. Here we used a set-point
bias voltage of -300 mV, which is far beyond the energy scale of the
charge-ordered state, to eliminate possible set-point effects. We then fitted each dI/dV
spectrum with a second-order polynomial, and took the second derivative of the
polynomial to obtain the D spectrum. The H(r) map is directly obtained from the
dI/dV spectra grid.
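The polynomial step described above can be sketched as follows. The caption does not say over which bias range each second-order polynomial is fitted, so the sliding-window fit used here is an assumption, and the function name, window size and test spectrum are hypothetical.

```python
import numpy as np

def negative_second_derivative(bias, didv, half_window=3):
    """Return D = -(second derivative of dI/dV with respect to bias).

    A second-order polynomial is fitted to dI/dV inside a sliding window
    around each bias point; points too close to the edges are left as NaN.
    """
    bias = np.asarray(bias, dtype=float)
    didv = np.asarray(didv, dtype=float)
    D = np.full(bias.shape, np.nan)
    for i in range(half_window, len(bias) - half_window):
        window = slice(i - half_window, i + half_window + 1)
        # Fit g(V) = c2*V**2 + c1*V + c0 with V measured from bias[i].
        c2, _c1, _c0 = np.polyfit(bias[window] - bias[i], didv[window], 2)
        D[i] = -2.0 * c2  # second derivative of the fitted polynomial, negated
    return D

# Hypothetical spectrum: a pure parabola, whose second derivative is constant.
bias = np.linspace(-30.0, 30.0, 61)
didv = 0.5 * bias ** 2 + 2.0
D = negative_second_derivative(bias, didv)
```

Applying this to every spectrum on the measurement grid and taking D at a fixed bias would give a D(r) map of the kind shown in panels d-h.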
Extended Data Table 1 | Optimizing fabrication process for monolayer and bilayer Bi-2212 samples

Sample     Contact method                                    Air exposure  Total fabrication   Bulk     No. of   Tc
                                                             time          time in glove box   crystal  devices
Bilayer    Metal evaporation at room temperature             5 min         4 h                 OP88     3        50-70 K
Bilayer    Pre-patterned bottom contact                      2 s           2 h                 OP88     1        ~70 K
Bilayer    Pre-patterned bottom contact                      2 s           2 h                 OD55     1        80-90 K
Bilayer    Cold welding                                      /             2 h                 OP88     4        80-90 K
Monolayer  Metal evaporation at room temperature             5 min         4 h                 OP88     2        Insulating
Monolayer  Pre-patterned bottom contact”)                    2 s           2 h                 OP88     3        Insulating
Monolayer  Pre-patterned bottom contact                      2 s           2 h                 OD55     1        Insulating
Monolayer  Metal evaporation at room temperature”            /             4 h                 OP88     2        Insulating
Monolayer  Metal evaporation at low temperature (~100 K)®    /             4 h                 OD55     3        <10 K
Monolayer  Cold welding                                      /             2 h                 OP88     2        <40 K
Monolayer  Cold welding                                      /             2 h                 OD55     4        70-80 K
Monolayer  Cold welding»)                                    /             0.5-1 h             OD55     14       80-90 K

We have systematically investigated the effects of the following key factors on the transport properties of monolayer and bilayer Bi-2212: contact method, air-exposure time and total fabrication time in glove box. We observe that monolayer Bi-2212 is more prone to degradation than is bilayer graphene. In particular, exposure to air is most detrimental to sample quality of the monolayers. Evaporating metal contacts (through shadow mask) also causes considerable degradation. We find that cold-welding indium contacts in the glove box preserves the monolayer sample quality. Here, thin indium foils make stable contacts with a thick flake that is connected to the monolayer, so that the thick flake electrically bridges the indium electrodes and the monolayer (Fig. 1e). The monolayer samples exfoliated from an over-doped crystal (OD55) are slightly over-doped, and their maximum Tc is comparable to the Tc of optimally doped bulk crystals (Fig. 2c).
(#1)-(#5) Transport properties of typical monolayer samples from these categories are shown in Extended Data Fig. 1.


Extended Data Table 2 | Annealing sequence of monolayer Bi-2212

Annealing temperature   Annealing time   Doping regime            Δ (meV)
As-exfoliated           /                Over-doped               28 ± 1
25 °C                   1 week           Nearly optimally doped   40 ± 1
130 °C                  30 min           Under-doped              56 ± 1
220 °C                  30 min           Under-doped              122 ± 4
265 °C                  30 min           Extremely under-doped    250 ± 20

This sequence relates to the monolayer Bi-2212 shown in Fig. 6. The as-exfoliated monolayer Bi-2212 (over-doped; Δ = 28 ± 1 meV) was annealed under UHV with a base pressure of 1 × 10"? mbar.
The pseudogaps Δ were extracted from spatially averaged conductance spectra.
Shape-morphing systems, which can perform complex tasks through morphological
transformations, are of great interest for future applications in minimally invasive
medicine, soft robotics* ®, active metamaterials’ and smart surfaces®. With current
fabrication methods, shape-morphing configurations have been embedded into
structural design by, for example, spatial distribution of heterogeneous materials9-14,
which cannot be altered once fabricated. The systems are therefore restricted to a
single type of transformation that is predetermined by their geometry. Here we
develop a strategy to encode multiple shape-morphing instructions into a
micromachine by programming the magnetic configurations of arrays of
single-domain nanomagnets on connected panels. This programming is achieved by
applying a specific sequence of magnetic fields to nanomagnets with suitably tailored
switching fields, and results in specific shape transformations of the customized
micromachines under an applied magnetic field. Using this concept, we have built an
assembly of modular units that can be programmed to morph into letters of the
alphabet, and we have constructed a microscale ‘bird’ capable of complex behaviours,
including ‘flapping’, ‘hovering’, ‘turning’ and ‘side-slipping’. This establishes a route for
the creation of future intelligent microsystems that are reconfigurable and
reprogrammable in situ, and that can therefore adapt to complex situations.


It has been a long-standing goal to create intelligent machines that are
untethered and can execute tasks at small scales. Magnetic actuation
is of particular interest for the control of these machines, because
it comes with the advantage of being able to perform tasks in confined
and enclosed spaces®. In magnetic shape-morphing systems,
a mechanical torque is generated when the magnetization of a magnetic
medium is not in line with the applied magnetic field’®. Previous
programming of the magnetic configurations could be achieved by
reorienting permanent-magnet microparticles in millimetre-sized
devices’®”!8. For micrometre-sized devices, programming methods
such as aligning superparamagnetic nanoparticles”” (region i in Fig. 1a)
and selective coating with soft magnetic thin films” (region iii in Fig. 1a)
have been demonstrated. In this work, we used stadium-shaped
single-domain nanomagnets to encode shape-morphing information into
micromachines. The use of nanomagnets with lateral dimensions in an
intermediate range, between 100 nm and 500 nm (region ii in Fig. 1a),
means they are single domain with a stable remanent magnetization
and have a tunable magnetic anisotropy at room temperature”.
As a result of the magnetic shape anisotropy, the magnetization is parallel
to the long axis of the magnets, pointing in one of the two directions.
By implementing arrays of these nanomagnets in a micromachine,
the magnetic configuration can be remotely programmed by applying
a sequence of magnetizing fields to store the shape-morphing
information.


Inspired by origami’, the art of paper folding, the micromachine is
designed with two types of component: rigid panels, some of which are
made functional with arrays of single-domain nanomagnets covering
the panel surface; and structured ‘soft’ spring hinges as the connecting
creases. In the micromachine design shown in Fig. 1b, there are four
panels patterned with 60-nm-thick nanomagnets that can be remotely
encoded and manipulated, and one central passive panel (see Extended
Data Fig. 1 for a scanning electron microscopy, SEM, image). Arrays of
nanomagnets with different magnetic switching fields are fabricated
on opposite panels, for example, panels I and II in Fig. 1b. The switching
fields of the nanomagnets were engineered by varying the aspect ratios
of the nanomagnets while maintaining the same volume (see Methods
section ‘Nanomagnet design and coercivity’ and Extended Data
Fig. 2). As given by the magnetic hysteresis loops in Fig. 1c (see Methods
section ‘Magnetic characterization and encoding’), the coercive
fields Bc required to switch the magnets range from about 30 mT for
low-aspect-ratio ‘wide’ nanomagnets (300 nm × 110 nm, nanomagnet
type IV), to about 140 mT for high-aspect-ratio ‘narrow’ nanomagnets
(520 nm × 60 nm, nanomagnet type I). The square shape of the loops
indicates that they are fully magnetized at remanence. Arrays of type
I and type II nanomagnets are patterned on the opposite panels of the
four-panel micromachine, and have switching fields of Bc(I) = 140 mT
and Bc(II) = 90 mT, respectively. Type III and type IV nanomagnets with
Bc(III) = 70 mT and Bc(IV) = 30 mT are employed later in this work.
Fig. 1 | Design of a four-panel shape-morphing micromachine. a, Schematic
diagram of the magnetic states found in magnets with increasing size:
i, superparamagnetic; ii, stable single domain at room temperature; and
iii, multidomain state. The red arrows indicate possible magnetization
directions. The red trace schematically illustrates the size dependence of the
coercivity of the magnets. Nanomagnets in region ii (the shaded area) are
implemented in this work. b, Top, four-panel micromachine with an array of
520 nm × 60 nm (type I) nanomagnets on panel I and 398 nm × 80 nm (type II)
nanomagnets on panel II; bottom, corresponding SEM images of the
nanomagnet arrays. The zig-zag hinge spring has six turns. c, Magneto-optical
Kerr effect hysteresis loops of single-domain nanomagnets with the same
volume but with six different aspect ratios. The lateral dimensions are
indicated to the right of the vertical axis in nanometres. The coercive fields of
the type I-IV nanomagnets are Bc(I) = 140 mT, Bc(II) = 90 mT, Bc(III) = 70 mT and
Bc(IV) = 30 mT. d, Schematic of the encoding of the micromachine using two
fields, B1 > Bc(I) (large enough to switch type I and type II nanomagnets) and
Bc(I) > B2 > Bc(II) (large enough to switch type II, but not type I, nanomagnets)
applied along both the horizontal and vertical directions (see main text for
details). e, Schematics of the magnetic configurations (with type I and type II
nanomagnets) and micromachine folding behaviour on application of the
controlling magnetic field B = 15 mT, with optical microscope images showing
the four different conformations of the fabricated devices. Going from left to
right, the numbers of panels folding up/down are 4/0, 3/1, 2/2 (opposite panels
having different folding directions) and 2/2 (opposite panels having the same
folding direction). Scale bars: 500 nm (b), 10 μm (all images in e).

The micromachine can then be encoded by a series of magnetizing
fields, shown schematically in Fig. 1d. By applying a magnetic field
B1 > Bc(I) along the x direction, both type I and type II nanomagnets
are magnetized with full remanent magnetization. A second, lower,
magnetic field Bc(I) > B2 > Bc(II) is then applied in the opposite direction
to remagnetize the type II magnets in the opposite direction, so
that the magnetizations of the arrays on the two panels point head-to-head
along the x direction (orange and turquoise horizontal arrows
in Fig. 1d). Similarly, applying B1 and B2 fields sequentially along the
y direction gives head-to-head magnetization for the other two
panels, so that the device has all four panels magnetized towards
the centre. Furthermore, using magnetizing field protocols with
different combinations of B1 and B2 fields, the same micromachine
can be magnetized into different magnetic configurations and, since
each panel can be magnetized in one of two opposite directions,
there are a total of 2⁴ = 16 magnetic configurations for the same
micromachine.

After programming the magnetic configurations, the micromachine
is released from the substrate (see Methods section ‘Sample fabrication’)
and actuated with an applied magnetic field B that provides a
magnetic torque τ = m × B on the panels, where m is the total magnetic
moment of the nanomagnet arrays on a given panel. All of the panels
patterned with nanomagnets try to align with the applied magnetic
field direction, which is counterbalanced by the mechanical torque
from the deformed hinge springs. (For spring designs and mechanical
calculations, see Methods section ‘Hinge spring design and properties’
and Extended Data Fig. 3.) When actuated by the controlling field, the
four types of magnetic configuration give four distinct conformations,
as shown in Fig. 1e and Supplementary Video 1. The transformations of
our micromachines require actuation fields (<15 mT) that are smaller
than the switching fields, which are in the range 30-140 mT for the
type I-IV nanomagnets. The complete set of shape transformations
for all 16 magnetic configurations for this four-panel micromachine
is shown in Extended Data Fig. 4. There are four distinct conformations
(indicated by the four different background colours) due to the
four-fold rotational symmetry of this particular micromachine.
Nevertheless, the full programmability is given by the 16 different
magnetic configurations, which can give additional conformations when
assembling the micromachines into multicomponent devices or when
using an asymmetric machine design. We have therefore demonstrated
that, by encoding magnetic configurations through magnetizing field
protocols, micromachines having the same structural design can be
programmed with different shape-morphing behaviours.

Fig. 2 | Encoding letters of the alphabet into a shape-morphing
micromachine assembled from an array of 4 × 4 four-panel units.
a, Conceptual design of a micromachine with 16 four-panel units that can be
encoded to transform into letters ‘P’ and ‘X’, as illustrated by the pair of
schematics on the left. Each box in the schematic represents a four-panel unit,
where all the magnetizations of the panels in a given unit can be encoded to
point outwards or inwards. Consequently, the central panel will move down or
up when an out-of-plane controlling field B is applied and we assign these to be
the ‘0’ state or the ‘1’ state, respectively, as illustrated by the insets on the right.
In order to combine mode ‘P’ and mode ‘X’ into a single micromachine, as
illustrated by the central schematic, each unit is assigned one of four possible
coding states given by (P, X) = (0, 0), (1, 1), (1, 0) or (0, 1). b, Design of the four
different four-panel units with the four different coding states. Each panel
contains an array of identical magnets whose orientation is given by the
coloured bar on the panels. The colour of the bars corresponds to the colour of
the hysteresis loops for the four different types of nanomagnets, I-IV, shown in
Fig. 1c. c, Encoding the magnetization in the arrays of nanomagnets of the
micromachine for the ‘P’ and ‘X’ shape morphing. The magnetization
directions are given by the pairs of arrows labelled M, and the arrows within the
boxes, which are coloured according to the four types of nanomagnet.
d, Schematics and corresponding optical microscope images of the fabricated
devices encoded for ‘P’ and ‘X’ shape morphing. The devices are actuated with
B = 15 mT. Scale bar (for both images), 20 μm.

Multicomponent shape-morphing micromachines can be constructed
by assembling modular units such as the four-panel devices shown in
Fig. 1e. We first built a micromachine by assembling the same modular
units into a 3 × 3 array, which provides four distinct conformations (see
Extended Data Fig. 5 and Supplementary Video 2). Furthermore, multiple
tailored conformations can be attained by customizing the individual
units and encoding their magnetic configurations. To demonstrate this,
we engineered a micromachine that can transform to two distinct letters
of the alphabet, ‘P’ and ‘X’, using different nanomagnetic encoding, as
shown in Fig. 2. For this, we first selected a 4 × 4 assembly of the four-panel
units (Fig. 2a) and, when applying the controlling magnetic field, each
of the units moves either ‘up’ (magnetization on all four panels points
inwards, represented by a ‘1’ state), or ‘down’ (magnetization on all four
panels points outwards, represented by a ‘0’ state), with the set of ‘1’-state
units representing a letter. Building both mode ‘P’ and mode ‘X’ into a
single machine gives four distinct coding states, (P, X) = (0, 0), (1, 1), (1, 0)
or (0, 1), for each unit, where the first digit is associated with the ‘P’ mode
and the second digit is associated with the ‘X’ mode. We created four
types of assembly unit, each having one of the four coding states, which
was achieved using different combinations of arrays of the four types of
nanomagnet, I-IV (Fig. 2b). The micromachine was then constructed
from these four assembly units, corresponding to the arrangement of
the coding states shown in Fig. 2a. The micromachine is then encoded
so that one set of magnetic configurations of the units represents ‘P’ and
the other represents ‘X’ (Fig. 2c). After releasing the encoded devices
from the substrate, the micromachines display shape-morphing into
‘P’ and ‘X’ patterns when actuated by an applied controlling field (see
Fig. 2d and Supplementary Video 3). In addition to these two patterns,
this micromachine design also has ‘9’ and ‘0’ modes, which are conjugate
modes of ‘P’ and ‘X’ modes, respectively (see Extended Data Fig. 6). This
modular design concept can therefore be used to create complex
shape-morphing systems with tailored three-dimensional (3D) transformations
by customizing the design and layout of the functional units.

Fig. 3 | Origami-like microscale ‘bird’ with multiple shape-morphing modes.
a, Schematic of the folding behaviours of two-panel devices when actuated
using an applied out-of-plane magnetic field. These are achieved by setting the
magnetization perpendicular, parallel or at an angle to the folding crease, as
indicated by the red arrows on the panels. b, Schematic of the panel design
using type I-IV nanomagnets with the magnet long axis oriented along x and y
directions and the colour of the nanomagnets corresponding to the colours of
the magnetic hysteresis loops in Fig. 1c. c, SEM image of the microscale ‘bird’.
The coloured bars indicate the location of the arrays of type I-IV nanomagnets
and the orientation of the nanomagnets. d, Schematic (left) and optical images
(right) of a microscale ‘bird’ mimicking four flying modes, ‘flapping’,
‘hovering’, ‘turning’ and ‘side-slipping’. From left to right in each row: encoded
magnetization direction (coloured arrows) of the type I-IV nanomagnets with
the colours corresponding to the colours of the magnetic hysteresis loops in
Fig. 1c; schematic of a flat microscale ‘bird’ with the total magnetization
direction for each panel indicated with red arrows; schematic showing the
folding of the microscale ‘bird’ under the indicated controlling field B; and the
optical microscope images of the experimental demonstrations. For ‘flapping’,
the three optical images show the shape transformation in different controlling
fields B. For ‘hovering’, shape transformations in a magnetic field (1.5 mT, 1 Hz),
which rotates back and forth, are shown with the three optical images
corresponding to three successive field directions during the field rotation.
For ‘turning’, three successive snapshots of the shape transformations in an
alternating magnetic field (11.6 mT, 24.5 Hz) are shown, and for ‘side-slipping’,
three successive snapshots of the shape transformation in an alternating
magnetic field (6.2 mT, 19.5 Hz) are shown. The solid black arrows indicate the
direction of the applied magnetic field. In the optical images of the ‘turning’
and ‘side-slipping’ modes, a dashed red line connects a fixed reference point on
the substrate (large red dot) and the middle point between the two wings of the
‘bird’ (small red dot), highlighting the motion of the ‘bird’. Scale bars: c, 15 μm;
d, 30 μm.
More advanced folding behaviours, such as the bending and twisting 
shown in Fig. 3a, can be achieved by programming the rigid panels 
of the shape-morphing micromachines with arbitrary magnetic 
configurations. The magnetic moment of the panels can be tuned 
by including arrays of nanomagnets with different aspect ratios and 
orientations (see Methods section ‘Design of nanomagnet arrays’). 
In the example shown in Fig. 3b, with four types of nanomagnet I-IV 
oriented along both the x and y directions, there are 
$2^4 \times 2^4 = 2^8$ magnetic configurations on a single panel. By 
adjusting the quantities $n_x$ and $n_y$ of each type of nanomagnet 
along the x and y directions, it is possible to assign to each panel a 
total magnetic moment of almost arbitrary magnitude and direction. 
As a demonstration, we have engineered an origami-like microscale 
‘bird’ to mimic the different flying modes of a real bird (see Fig. 3c). 
This is achieved with specific arrangements of the nanomagnet arrays 
on the five body parts: head, neck, body, tail and a pair of wings. For 
the tips and joints of the bird’s wings, there are four arrays of 
nanomagnets on each panel with different orientations and 
switching fields. Therefore the total magnetization on these panels has 
eight possible directions and one demagnetized state (see Extended 
Data Fig. 7). By encoding the nanomagnets on the microscale ‘bird’ 
with different magnetic configurations, we demonstrate four distinct 
morphological transformations (‘flapping’, ‘hovering’, ‘turning’ and 
‘side-slipping’; see Fig. 3d and Supplementary Videos 4-7), achieved 
by varying the magnetic fields, as indicated in Fig. 3d. Here advanced 
folding behaviours are demonstrated: in the ‘hovering’ mode, the left 
and right wings ‘twist’ relative to the body, and in the ‘side-slipping’ 
mode, the right wing tip ‘bends and twists’ relative to the right wing 
joint. Ways to achieve additional transformations are discussed in 
Methods section ‘Further transformations’. 

For future applications, our micromachines have a wide range of 
tunability in terms of size, ranging from submicrometre-sized panels 
with a single nanomagnet up to millimetre-sized devices: this range 
is limited only by the fabrication methods available. The nanomagnet 
arrays can be further engineered to have temperature-dependent 
magnetic properties, and they can be modulated using radio-frequency 
magnetic fields and light. This possibility of control with several 
different stimuli gives the micromachines further functionality in 
many different environments. Nanoscale magnets switch in only a few 
nanoseconds, which is much faster than the mechanical response of the 
micromachines, which occurs on the millisecond timescale. Therefore, the 
micromachines can be reprogrammed in situ using a short (nanosecond 
to millisecond) magnetic field pulse. With the ability to precisely 
control transformations at the micrometre scale, our micromachines also 
offer a platform to construct 3D magnetic metamaterials, such as a 3D 
realization of artificial spin ice, and photonic metamaterials, where 
optical properties, such as the polarization of transmitted light, can be 
tuned by magnetically actuated transformations. This concept can also 
be applied in flexible electronics, with morphable 3D structures having 
multistable states. By encoding the nanomagnets, the devices can be 
readily switched between these states using an applied magnetic field. 
Methods 


Sample fabrication 

A schematic of the fabrication process can be found in Extended Data 
Fig. 8. The samples were fabricated on a 50-nm-thick low-stress silicon 
nitride membrane (Silson Ltd, UK) supported by a rigid silicon frame. 
First, the nanomagnets were fabricated on the membrane with elec- 
tron beam lithography using an electron beam writer (Vistec EBPG 
5000PlusES) to pattern a spin-coated 50k poly(methylmethacrylate) 
(PMMA)/950k PMMA double-layer. A magnetic film of 5 nm Ti (adhe- 
sion layer)/60 nm Co/3 nm Al (capping layer) was thermally evapo- 
rated at a base pressure of ~1 x 10° mbar onto the patterned resist, 
which was followed by a lift-off process in acetone. Then the panels 
were fabricated by spin-coating a 950k PMMA layer on the front of the 
silicon nitride membrane, which was then patterned using electron 
beam lithography in order to define the geometry of the rigid panels 
and hinge springs of the micromachines. After coating a second 950k 
PMMA layer on the back of the membrane, reactive ion etching (RIE, 
Oxford PlasmaPro 100) was performed on the front side to etch through 
the 50-nm-thick silicon nitride membrane. After the RIE process, the 
fabricated micromachines were only supported by the free-standing 
PMMA layer coated on the back of the membrane. The devices were 
then magnetized by a sequence of magnetic fields and released in an 
organic solvent, propylene glycol monomethyl ether acetate (PGMEA). 
A further discussion of this release process and possible operation of 
the micromachines in other media can be found in Methods section 
‘Micromachine release and operation’. The SEM (Zeiss Supra VP55) 
images were taken with a 3-10 kV acceleration voltage. 


Magnetic characterization and encoding 

The nanomagnets were characterized with MOKE measurements using 
a commercial setup (NanoMOKE, Durham Magneto Optics Ltd.) in the 
longitudinal mode with a focused laser spot with a diameter of ~10 μm. 
Each hysteresis loop was obtained by averaging ten measurements. The 
nanomagnet arrays were encoded using the magnetizing field from 
the electromagnet in the NanoMOKE setup. The electromagnet was 
equipped with a manual rotation stage to hold the sample between 
the electromagnet iron poles. By rotating the sample on this stage, 
a magnetic field of up to 400 mT could be applied in any direction in 
the sample plane. 


Magnetic manipulation 

After the sample fabrication and the magnetic field encoding, the sam- 
ple was released in an organic solvent (PGMEA) while being observed 
with an optical microscope incorporating three pairs of Helmholtz coils 
in an orthogonal configuration. A computer was used to simultaneously 
control the electric currents in the three pairs of Helmholtz coils to 
produce a magnetic field vector in arbitrary 3D directions. A magnetic field 
with a maximum magnitude of 15 mT was generated to manipulate the 
released micromachines. The observed folding behaviour and motion 
were captured with a video camera (Grasshopper GRAS-03K2C, Point 
Grey Research) on the microscope. The tested micromachines were 
actuated reversibly with a dynamic field more than 18,000 times (30 Hz, 
10 min) with no observable signs of plastic deformation or breaking. 


Nanomagnet design and coercivity 
The magnetic domain state of an element made of a soft polycrystalline 
ferromagnetic material (typically Fe, Ni and Co) is dictated by the 
competition between the quantum mechanical exchange energy and 
the stray field energy that is size- and shape-dependent. By varying 
the lateral size and shape, it is possible to create single-domain 
nanomagnets. 

The nanomagnets are designed in a stadium shape, comprising a 
rectangle with one semi-circle at each end, with the width of the 
rectangle matching the semi-circle diameter, as shown in Extended Data 
Fig. 2a. This shape ensures that the remanent magnetization is parallel 
to the long axis of the magnets, pointing in one of the two directions. 
The volume of the stadium-shaped nanomagnet is given by: 

$$V = \left[(L - d)\,d + \pi\left(\frac{d}{2}\right)^2\right] t \tag{1}$$


When the length, width and thickness of the nanomagnets are 
L = 300 nm, d = 110 nm and t = 60 nm, respectively, the volume of the 
nanomagnet is $V_0$ = 1.82 × 10⁻²¹ m³. The total magnetic moment of a 
nanomagnet is given by $m = M_s V$, where $M_s$ is the saturation 
magnetization of the magnetic material. In this work, 60-nm-thick cobalt 
nanomagnets are employed, and the saturation magnetization of the 
thermally evaporated Co thin film is $M_s$ = 1,153 kA m⁻¹, measured using 
a superconducting quantum interference device vibrating sample 
magnetometer (SQUID VSM) at room temperature. 
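
The numbers in this paragraph can be checked with a short calculation. The sketch below evaluates equation (1) for the quoted dimensions and multiplies by the quoted saturation magnetization; the function and variable names are illustrative choices, not part of the original work.

```python
import math


def stadium_volume(length_m, width_m, thickness_m):
    """Equation (1): rectangle plus two semicircular end caps, times thickness."""
    rectangle = (length_m - width_m) * width_m
    caps = math.pi * (width_m / 2) ** 2  # two semicircles make one full circle
    return (rectangle + caps) * thickness_m


# Dimensions and saturation magnetization quoted in the text (SI units)
L, d, t = 300e-9, 110e-9, 60e-9
M_s = 1.153e6  # A/m, i.e. 1,153 kA/m

V0 = stadium_volume(L, d, t)
m0 = M_s * V0  # magnetic moment of one single-domain nanomagnet

print(f"V0 = {V0:.3e} m^3")
print(f"m0 = {m0:.3e} A m^2")
```

Both results agree with the values stated in the text to three significant figures.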

Nanomagnets with different aspect ratios, L/d, display different 
magnetic coercivities, as shown in Fig. 1c. Here we keep the same 
nanomagnet volume $V_0$ and thickness $t_0$ = 60 nm for all nanomagnets 
regardless of the aspect ratio. Therefore, they have the same magnetic 
moment $m_0 = M_s V_0$ and equation (1) can be modified to read: 

$$V_0 = \left[(L - d)\,d + \pi\left(\frac{d}{2}\right)^2\right] t_0 \tag{2}$$

So that the relation between L and d is given by: 

$$L = \frac{V_0}{t_0 d} + \left(1 - \frac{\pi}{4}\right) d \tag{3}$$


A plot of d against L, determined using equation (3), is shown in 
Extended Data Fig. 2c. In this figure, nanomagnets with the same 
volume $V_0$ but with six different aspect ratios are selected, that is, with 
lateral dimensions of 520 nm × 60 nm, 450 nm × 70 nm, 398 nm × 80 nm, 
358 nm × 90 nm, 326 nm × 100 nm and 300 nm × 110 nm. The layout 
of the nanomagnet arrays is schematically shown in Extended Data 
Fig. 2b, with a spacing of d/2 in the x direction. In this layout, the dipolar 
coupling between the nanomagnets supports parallel alignment of 
the magnetization between neighbouring magnets. In the y direction, 
neighbouring nanomagnets are separated by a distance s = 40 nm for 
all nanomagnet arrays. 
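
As a consistency check on the six quoted geometries, the constant-volume condition can be solved for the length L at each width d. The sketch below assumes the stadium volume of equation (1) with a fixed thickness of 60 nm, which rearranges to L = V0/(t0 d) + (1 - π/4) d; the helper name `length_for_width` is hypothetical.

```python
import math

t0 = 60e-9  # m, common thickness
# Reference volume from the 300 nm x 110 nm stadium, using equation (1)
V0 = ((300e-9 - 110e-9) * 110e-9 + math.pi * (110e-9 / 2) ** 2) * t0


def length_for_width(d_m, volume_m3=V0, thickness_m=t0):
    """Length L that keeps the stadium volume fixed for a given width d."""
    return volume_m3 / (thickness_m * d_m) + (1 - math.pi / 4) * d_m


# Widths and lengths (nm) quoted in the text
quoted_nm = {60: 520, 70: 450, 80: 398, 90: 358, 100: 326, 110: 300}

computed_nm = {d: length_for_width(d * 1e-9) * 1e9 for d in quoted_nm}
for d, L_quoted in quoted_nm.items():
    print(f"d = {d:3d} nm: L = {computed_nm[d]:6.1f} nm (quoted {L_quoted} nm)")
```

Under these assumptions the computed lengths fall within about 1 nm of the quoted values, which suggests the quoted dimensions are rounded to the nearest fabrication step.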

SEM images of the fabricated arrays are given in Extended Data 
Fig. 2e. The MOKE magnetic characterization is shown in Fig. 1c, with 
the square shapes of the hysteresis loops confirming the single-domain 
magnetic state of the nanomagnets. Since all nanomagnets investigated 
in this study have the same magnetic moment, $m_0$ = 2.10 × 10⁻¹⁵ A m², the 
total magnetic moment of the nanomagnet arrays on an individual 
panel (when all magnetizations are aligned in the same direction) is 

$$m_{\mathrm{total}} = N_{\mathrm{magnets}}\, m_0 \tag{4}$$

where $N_{\mathrm{magnets}}$ is the number of nanomagnets on the panel. For the 
device demonstrated in Fig. 1, a total of 1,040 nanomagnets are fabricated 
on each panel, with $m_{\mathrm{total}}$ = 2.18 × 10⁻¹² A m² pointing parallel to the 
long axis of the nanomagnets. 

The square shapes of the hysteresis loops measured using the MOKE, 
shown in Fig. 1c, indicate that all six types of designed nanomagnets are 
fully magnetized at remanence. As shown by the MOKE curves in Extended 
Data Fig. 2d, the magnetic switching occurs over a relatively small field 
range of 5-20 mT. Since the six transition regions do not overlap, all six 
types of nanomagnets can be individually programmed, even for a 
micromachine containing arrays of all six nanomagnets oriented in the 
same direction. 

Hinge spring design and properties 

Inspired by the art of origami, we have designed rigid panels carrying 
single-domain nanomagnets, connected by structured hinge 
springs acting as folding creases. The hinge spring layout needs to 
be designed so that there is a considerable folding behaviour on the 
application of a small magnetic field, B < 15 mT. 

We first derive the relationship between applied magnetic field B and 
panel rotation angle θ. Neglecting the bending of the short sections 
of the spring, the twisting of the zig-zag spring can be determined by 
considering a long and slender beam of equivalent length that can twist 
subject to an applied torque. 

Considering first the twist of an isolated beam with uniform 
cross-section along its length (see Extended Data Fig. 3a), the angle of twist 
in radians is given by $\theta = \tau l/(GJ)$, where τ is the applied torque, 
l is the beam length, and G and J are the shear modulus and torsional 
constant of the material, respectively. 

For a beam with a rectangular section, the torsional constant is given 
by 

$$J = \frac{wt(w^2 + t^2)}{12} \tag{5}$$

where w and t are the side lengths. For a homogeneous isotropic 
material, 

$$G = \frac{E}{2(1 + \nu)} \tag{6}$$

where E is the Young’s modulus and ν is the Poisson ratio. Hence the 
panel rotation angle is given by: 


$$\theta = \frac{24(1 + \nu)L}{Ewt(w^2 + t^2)}\,\tau \tag{7}$$

The applied torque is given by τ = kθ, so that the magnitude of the 
torsional spring constant, k, is given by 

$$k = \frac{Ewt(w^2 + t^2)}{24(1 + \nu)L} \tag{8}$$
For the current design, the total length of such a beam is L = nl, where 
l is the length of each beam section and n is the total number of beam 
sections. Therefore, for each spring: 

$$k_s = \frac{Ewt(w^2 + t^2)}{24(1 + \nu)nl} \tag{9}$$

If two springs are used and they are in a parallel configuration: 

$$k_{\mathrm{total}} = k_s + k_s = \frac{Ewt(w^2 + t^2)}{12(1 + \nu)nl} \tag{10}$$


Taking into account that the direction of panel rotation θ is opposite 
to the direction of the mechanical torque induced on the spring, the 
total torque on the spring is given by: 

$$\tau_{\mathrm{spring}} = -k_{\mathrm{total}}\,\theta \tag{11}$$

We now determine the extent of the folding in a magnetic field for hinge 
designs with different numbers of turns in our micromachines. For 
this, we have fabricated simple two-panel devices to investigate the 
relationship between the applied magnetic field and the panel rotation 
angle. An SEM image of an eight-turn two-panel device is shown in 
Extended Data Fig. 3b. The fabricated devices have identical arrays of 
nanomagnets (398 nm × 80 nm nanomagnets on the left panel and 
520 nm × 60 nm nanomagnets on the right panel) but the number of 
turns in the hinge spring is varied, with q = 2, 4, 6 and 8; see Extended 
Data Fig. 3c. The total length of each spring is therefore L = nl = (2q + 1)l, 
where l = 5 μm is the length of each section of the spring. 

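According to the illustration in Extended Data Fig. 3d, the 
magnetic-field-induced torque on the device is 
$\boldsymbol{\tau}_B = \mathbf{m} \times \mathbf{B}$ and its magnitude is: 

$$\tau_B = mB\sin(90^\circ - \theta) = mB\cos\theta \tag{12}$$

At equilibrium: 

$$\tau_B + \tau_{\mathrm{spring}} = 0 \tag{13}$$

Introducing equations (11) and (12) into equation (13) gives: 

$$mB\cos\theta - k_{\mathrm{total}}\,\theta = 0, \qquad \frac{\theta}{\cos\theta} = \frac{mB}{k_{\mathrm{total}}} \tag{14}$$

Introducing equations (4) and (10) into equation (14), we obtain: 

$$\frac{\theta}{\cos\theta} = \frac{12(1 + \nu)\,nl\,N_{\mathrm{magnets}}\,m_0 B}{Ewt(w^2 + t^2)} \tag{15}$$
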
This equation is the relation between applied magnetic field B 
and panel rotation angle θ, which can be numerically solved. Here 
$N_{\mathrm{magnets}}$ = 1,175, n = [5, 9, 13, 17] for devices with 2-, 4-, 6- and 
8-turn hinges, and the lateral dimensions are w = 100 nm, t = 50 nm and 
l = 5 μm, as used in the experimental micromachines. From equation (4), 
the total magnetic moment on each panel is $m_{\mathrm{total}}$ = 2.47 × 10⁻¹² A m². 
Shown in Extended Data Fig. 3e are optical microscope images of the panel 
rotation for hinged spring designs with different numbers of turns 
following actuation in the same field, B = 5 mT. The computational 
results and experimental demonstrations are plotted in Extended Data 
Fig. 3f, g, respectively, and follow similar trends. The panel rotation 
angle is larger in the experiment, which may be partly due to 
over-etching in the final nanofabrication step leading to hinged springs 
that are narrower than in the original design. For the devices shown 
in Figs. 1-3, hinge springs with q = 6 turns were used. With the current 
possibilities for magnetic field control of field steps of 0.1 mT, we can 
orient the panels to within ~0.1° to 1°, depending on the panel rotation 
angle in an applied magnetic field. 
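
A minimal numerical sketch of how equation (15) might be solved is given below. It computes the total torsional spring constant from equation (10) and finds the rotation angle by bisection, which is safe because θ/cos θ increases monotonically on [0, π/2). The Young’s modulus (250 GPa) and Poisson ratio (0.23) are assumed placeholder values for silicon nitride and are not stated in this section, so the printed angles are illustrative only and are not the values plotted in Extended Data Fig. 3.

```python
import math


def spring_constant_total(E, nu, w, t, n, l):
    """Equation (10): two parallel zig-zag springs, each of unrolled length n*l."""
    return E * w * t * (w**2 + t**2) / (12 * (1 + nu) * n * l)


def rotation_angle(m_total, B, k_total, tol=1e-12):
    """Solve theta / cos(theta) = m_total * B / k_total on [0, pi/2) by bisection."""
    target = m_total * B / k_total
    lo, hi = 0.0, math.pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / math.cos(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Geometry and moment quoted in the text (SI units)
w, t, l = 100e-9, 50e-9, 5e-6
m_total = 2.47e-12  # A m^2
B = 5e-3            # T

# ASSUMED material constants (not given in the text)
E_assumed, nu_assumed = 250e9, 0.23

angles = {}
for turns, n in zip((2, 4, 6, 8), (5, 9, 13, 17)):
    k = spring_constant_total(E_assumed, nu_assumed, w, t, n, l)
    angles[turns] = rotation_angle(m_total, B, k)
    print(f"{turns}-turn hinge: theta = {math.degrees(angles[turns]):.1f} deg")
```

Whatever material constants are used, the model reproduces the qualitative trend described above: more turns give a softer hinge and a larger rotation at the same field.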

We now estimate the magnitude of the largest torque and force that 
can be generated from the panels patterned with nanomagnets. The 
induced mechanical torque on the panels τ has a linear relationship with 
the magnitude of the applied magnetic field B, since τ = m × B. However, 
on increasing B, the mechanical torque τ cannot be infinitely high, since 
the applied magnetic field may switch the nanomagnets. For example, 
if the magnetic field is applied along the long axis of the magnets, then 
it can alter the magnetization if it is larger than the switching field, 
which is 30 mT for the type IV nanomagnets. 

For a given micromachine, there are several panels in different ori- 
entations that change as the magnetic field is applied. We therefore 
assume that we can safely operate a machine with an upper limit to 
the field given by the coercive field $B_c$, that is, the switching field of 
the magnets when applying a field parallel to the magnet long axis. 

We calculate the torque and force that can be generated by a panel 
patterned with type IV nanomagnets. For the four-panel devices and 
their modular assemblies in Figs. 1, 2, there are in total 1,040 
nanomagnets on each 10 μm × 10 μm panel, with 
$m_{\mathrm{total}}$ = 2.18 × 10⁻¹² A m² pointing along the long axis of the 
nanomagnets. In this case, for a 30 mT operation field, the highest 
generated magnetic torque is τ = (2.18 × 10⁻¹² A m²) × (30 mT) × sin(90°) 
= 6.54 × 10⁻¹⁴ N m. Assuming one edge of the panel is hinged, then the 
highest generated force is f = (6.54 × 10⁻¹⁴ N m)/(10 μm) = 6.54 × 10⁻⁹ N 
= 6.54 nN. 

Similarly, for a 10 μm × 10 μm panel patterned with type I 
nanomagnets, which have a switching field of 140 mT, the highest generated 
mechanical torque is τ = 3.05 × 10⁻¹³ N m, and the highest generated 
force is f = 30.5 nN. 
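
The torque and force estimates above follow from τ = mB (with the field perpendicular to the moment) and f = τ divided by the panel edge length taken as the lever arm. The sketch below repeats the arithmetic for both nanomagnet types; the function name is an illustrative choice.

```python
def max_torque_and_force(m_total, B_switch, lever_arm):
    """Upper bounds at the switching field, with the field perpendicular to m."""
    torque = m_total * B_switch  # sin(90 deg) = 1
    force = torque / lever_arm   # force at the free edge of a hinged panel
    return torque, force


m_total = 2.18e-12   # A m^2, 1,040 nanomagnets per panel
panel_edge = 10e-6   # m

results = {}
for label, B_switch in (("type IV", 30e-3), ("type I", 140e-3)):
    results[label] = max_torque_and_force(m_total, B_switch, panel_edge)
    torque, force = results[label]
    print(f"{label}: torque = {torque:.3e} N m, force = {force * 1e9:.2f} nN")
```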

It has been shown previously that microrobots can be used to move 
cells by pushing them with forces in the piconewton range. For moving 
even larger objects, such as a 100 μm × 10 μm × 8 μm microbar, it 
has been shown that ~200 pN of force is needed. Therefore, since 
our machines can supply forces up to several nanonewtons, they are 
capable of manipulating biological objects such as cells. 


Design of nanomagnet arrays 
We engineer the orientation and magnitude of the magnetic moment 
of the arrays of nanomagnets. Here, we can consider a panel consisting 
of nanomagnets with p different aspect ratios, each with different 
switching fields $B_{c,i}$ and magnetic moments $m_i = M_s V_i$, where 
i = 1, 2, ..., p. $M_s$ is the magnetization of the magnetic material and 
$V_i$ is the volume of the nanomagnet. The nanomagnets with magnetic 
moment $m_i$ can be fabricated with several different arbitrary 
orientations $\varphi_{i,j}$ on a panel. Here, j = 1, 2, ..., $k_i$, where $k_i$ 
denotes the total number of different orientations for the nanomagnets 
with magnetic moment $m_i$. Therefore, we obtain the total magnetic 
moment $m_{\mathrm{total}}$ for all arrays of nanomagnets on a given panel 

$$m_{\mathrm{total}} = \sum_{i=1}^{p} \sum_{j=1}^{k_i} n_{i,j}\, m_i \left(\cos\varphi_{i,j} + \mathrm{i}\sin\varphi_{i,j}\right) \tag{16}$$

where $n_{i,j}$ is the quantity of the nanomagnets with magnetic moment 
$m_i$ and orientation $\varphi_{i,j}$. This total magnetic moment, 
$m_{\mathrm{total}}$, of nanomagnets on a single panel can be programmed with 
arbitrary magnitude and direction by careful adjustment of the magnetic 
moment $m_i = M_s V_i$, orientations $\varphi_{i,j}$ and quantities $n_{i,j}$ 
of the nanomagnets. 

In our micromachines, both the magnetization $M_s$ and the volume 
$V_i$ of the nanomagnets are the same for nanomagnets with different 
switching fields, so we can assume that the magnitudes of the magnetic 
moments $m_i = M_s V_i$ of all magnets in the array are the same, that is, 
$m_i = M_s V_i = M_s V_0 = m_0$. Therefore, equation (16) can be simplified to: 

$$m_{\mathrm{total}} = m_0 \sum_{i=1}^{p} \sum_{j=1}^{k_i} n_{i,j} \left(\cos\varphi_{i,j} + \mathrm{i}\sin\varphi_{i,j}\right) \tag{17}$$

In our systems, the nanomagnets only have two orientations, which 
are orthogonal to each other. This means that the magnets in the two 
orthogonal arrays can be magnetized independently by applying a field 
parallel to the long axes of the nanomagnets in one of the arrays. To be 
specific, in our design, the long axis of each magnet is aligned along one 
of two orientations, that is, parallel to one of the two coordinate axes (x 
or y axis, so $k_i$ = 2). Therefore, the magnetization of the nanomagnets 
can point along one of four cardinal directions (North, South, East and 
West) so that the orientation of the total magnetic moment is given by 

$$\varphi = \tan^{-1}\left(\frac{\sum_{i=1}^{p} n_{i,y}\sin\varphi_{i,y}}{\sum_{i=1}^{p} n_{i,x}\cos\varphi_{i,x}}\right) \tag{18}$$

where $n_{i,x}$ and $n_{i,y}$ are the numbers of type-i nanomagnets with a 
given switching field with the long axis along the x and y axis, 
respectively, so that $\varphi_{i,x} = 0, \pi$ and $\varphi_{i,y} = \pi/2, 3\pi/2$ 
depending on the direction of magnetization in the nanomagnets. According 
to equation (18), by changing the number of magnets with a given 
orientation, that is, $n_{i,x}$ and $n_{i,y}$, it is possible to obtain an 
arbitrary orientation φ of the total magnetic moment $m_{\mathrm{total}}$. 

For each panel, there are nanomagnets with $p_x$ and $p_y$ different 
switching fields along the x and y axis, respectively. It is therefore 
possible to program $2^{p_x} \times 2^{p_y}$ magnetic configurations, which 
have a total magnetic moment with a particular magnitude and orientation. 
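
The counting argument can be illustrated by enumerating the configurations of the wing-tip panel described in Extended Data Fig. 7, where two arrays lie along x and two along y, each holding the same number of identical nanomagnets. The sketch below represents the total moment as a complex number in units of one array's moment, in the spirit of equation (17). The assignment of "horizontal" arrays to x and "vertical" arrays to y is an assumption based on the figure description.

```python
from itertools import product

# (nanomagnet type, long-axis direction) for the four arrays on the wing tip;
# axis assignment assumed from the horizontal/vertical bars in the caption
arrays = [("III", "x"), ("IV", "x"), ("II", "y"), ("IV", "y")]


def total_moment(signs):
    """Total moment as a complex number, in units of one fully aligned array."""
    mx = sum(s for s, (_, axis) in zip(signs, arrays) if axis == "x")
    my = sum(s for s, (_, axis) in zip(signs, arrays) if axis == "y")
    return complex(mx, my)


# Every array can be magnetized along +axis (+1) or -axis (-1)
configs = list(product((+1, -1), repeat=len(arrays)))
moments = {total_moment(c) for c in configs}
directions = {m for m in moments if m != 0}

print(f"{len(configs)} configurations")
print(f"{len(moments)} distinct total moments")
print(f"{len(directions)} non-zero directions plus one demagnetized state")
```

With two arrays per axis, the 16 configurations collapse onto nine distinct total moments: eight non-zero directions and one demagnetized state, as stated for the wing tips and joints of the ‘bird’.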


In order to encode a panel with two or more different types of nano- 
magnet, one starts by magnetizing the highest-coercivity magnet, so 
that all magnets on the panel will be magnetized in the direction of 
the field. If required, the lower-coercivity magnets can be oppositely 
magnetized using a magnetic field applied in the reverse direction 
that is sufficiently low to leave the higher-coercivity nanomagnets 
unaffected. 
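
The encoding procedure described above can be sketched as a short routine that works from the highest-coercivity type downwards and emits a field pulse only when the desired direction changes. The switching fields for types I (140 mT) and IV (30 mT) are taken from the text; the values for types II and III, and the 10 mT safety margin, are hypothetical placeholders chosen only to make the example runnable.

```python
def encoding_pulses(switching_mT, targets, margin_mT=10.0):
    """Signed field pulses (mT) along the shared long axis that write `targets`.

    switching_mT: {type: switching field}; targets: {type: +1 or -1}.
    """
    order = sorted(switching_mT, key=switching_mT.get, reverse=True)
    fields = [switching_mT[name] for name in order]
    if any(hi - lo <= margin_mT for hi, lo in zip(fields, fields[1:])):
        raise ValueError("margin must be smaller than every switching-field gap")
    pulses, current = [], None
    for name in order:
        if targets[name] != current:
            # This pulse also sets every lower-coercivity type to the same direction
            current = targets[name]
            pulses.append(current * (switching_mT[name] + margin_mT))
    return pulses


def apply_pulses(switching_mT, pulses):
    """Simulate ideal single-domain switching for a sequence of pulses."""
    state = {name: 0 for name in switching_mT}
    for pulse in pulses:
        for name, field in switching_mT.items():
            if abs(pulse) > field:
                state[name] = 1 if pulse > 0 else -1
    return state


# Types I and IV from the text; types II and III are HYPOTHETICAL values
switching_mT = {"I": 140.0, "II": 100.0, "III": 60.0, "IV": 30.0}
targets = {"I": +1, "II": -1, "III": -1, "IV": +1}

pulses = encoding_pulses(switching_mT, targets)
print("pulse sequence (mT):", pulses)
print("final state:", apply_pulses(switching_mT, pulses))
```

The simulation step treats switching as an ideal threshold; in practice the margin would need to cover the measured 5-20 mT width of each transition region.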


Further transformations 

In this article, the transformations are inspired by origami, and the 
micromachines are constructed with rigid panels and structured soft 
creases. The rigid panels carry the programmed nanomagnet arrays, 
which provide the mechanical torque for transformations when an 
external magnetic field is applied. Here, the nanomagnets are designed 
to be stadium-shaped, with the remanent magnetization pointing in the 
plane of the rigid panel along the geometric long axis of the magnets 
as a result of the magnetic shape anisotropy. 

The current concept has limitations where folding into particular 
shapes is not straightforward. For example, the four-panel devices 
shown in Fig. 1, which have all four side panels magnetized towards 
or away from the centre, can achieve folding into an ‘uncovered box’. 
Shutting the lid to give a ‘closed’ box with an additional panel is not 
possible with the arrangement of nanomagnets used in this work. 
Instead, the magnetization of the nanomagnets on the lid needs to 
point out of the plane of the lid. This requires nanomagnets with 
out-of-plane magnetic anisotropy, which can be achieved with, for example, a 
Co/Pt multilayer system. 


Micromachine release and operation 

We used organic solvents, such as PGMEA and acetone, to dissolve 
the PMMA supporting layer and release the micromachines from the 
silicon substrate in the final step of the fabrication process. We chose 
PGMEA because it does not evaporate as fast as acetone and, in order 
to keep the experimental processes simple, we directly actuated the 
micromachines in the PGMEA solvent after the release. Nevertheless, 
the micromachines can also be operated in other working environments 
such as water and air and, since the nanomagnets have a 3 nm 
Al capping layer, they will not be easily oxidized. This 3 nm Al layer also 
ensures the biocompatibility of the micromachines. Here, we suggest 
three approaches to transfer and operate the structures in a water 
environment: 


Approach 1. Instead of using a PMMA support layer coated on the back 
of the membrane (steps 3-4 in Extended Data Fig. 8), water-soluble 
coatings, such as poly(acrylic acid) or Dextran, can be used. These 
are compatible with microfabrication techniques and, after the RIE 
etching, the structures can be directly released and operated in water. 


Approach 2. After the fabrication and the release of the structures in 
an organic solvent, the structures can be fished out and transferred to 
a water environment using a micromanipulator tip. This is a standard 
procedure for micro- and nano-robotic manipulation, and has been 
widely reported. 


Approach 3. After the fabrication and the release of the structures 
in an organic solvent, water substitution can be performed, for 
example, in a Petri dish containing the solvent and the micromachines. 
This technique is widely used in soft or shape-morphing microrobotics. 
It should be noted that PGMEA has limited miscibility with water. 
Therefore, if PGMEA is used to dissolve the PMMA for device release, 
acetone or isopropyl alcohol (IPA; fully miscible with PGMEA) can be 
used to substitute PGMEA first, and then a water substitution can be 
performed to replace acetone or IPA. An alternative approach is to 
directly use acetone to dissolve PMMA for device release, followed by 
a water substitution. 
Using approach 3 with acetone to dissolve the PMMA, we have 
demonstrated operation of the four-panel device in water (see Extended 
Data Fig. 9a, b). In addition, we have demonstrated the manipulation 
of 6-μm-diameter polystyrene microbeads (Polybead 15714-5, 
Polysciences, Inc.) in a water environment (Extended Data Fig. 9c, d and 
Supplementary Video 8). 

We have also demonstrated the operation of the micromachines in 
air. For this, we have fabricated single panels with one end attached 
by a hinge spring connection to a fixed silicon nitride membrane 
frame. In this case, both the PMMA support layer and the device 
release in PGMEA are not required, and the device is free to be 
manipulated in a magnetic field immediately after the reactive 
ion etching process. As shown in Extended Data Fig. 10 and Supple- 
mentary Video 9, on increasing an applied out-of-plane magnetic 
field, the panels fold upwards or downwards depending on the 
programmed magnetization direction of the nanomagnet arrays 
on each panel. 

It is worth noting that, even without suspension in a solvent, the 
micromachines will not collapse due to the gravitational force. For 
the four-panel micromachine shown in Fig. 1, the weight is the 
combined weight of the silicon nitride membrane and the nanomagnets. 
The weight of silicon nitride panels is approximately given by 
$\rho_{\mathrm{SiN}_x} \times V_{\mathrm{SiN}_x} \times g$ = 7.8 × 10⁻¹³ N, where 
$\rho_{\mathrm{SiN}_x}$ = 3.17 g cm⁻³ is the density of silicon nitride, 
$V_{\mathrm{SiN}_x}$ = 5 × 10 μm × 10 μm × 50 nm is the volume of the 
four-panel micromachine consisting of 5 rigid panels, and g = 9.8 m s⁻² is 
the standard gravity. The weight of the nanomagnets is approximately 
$\rho_{\mathrm{magnets}} \times V_{\mathrm{magnets}} \times g$ = 6.6 × 10⁻¹³ N, where 
$\rho_{\mathrm{magnets}}$ = 8.9 g cm⁻³ is the cobalt density, 
$V_{\mathrm{magnets}} = 4 \times 1{,}040 \times V_0$ is the total volume of the 
nanomagnets on the micromachine with four side panels, each having 1,040 
nanomagnets, and $V_0$ = 1.82 × 10⁻²¹ m³ is the volume of each individual 
nanomagnet. Therefore the total weight of the four-panel micromachine is 
7.8 × 10⁻¹³ N + 6.6 × 10⁻¹³ N = 1.44 pN. The magnitudes of the magnetic 
actuation force and the elastic force from the hinge springs are in the 
nanonewton range (as calculated in the Methods section ‘Hinge spring 
design and properties’), which is three orders of magnitude larger than 
the gravitational load. Therefore the devices can be manipulated in air 
without structural collapse due to gravity. 
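
The weight estimate can be reproduced with a few lines of arithmetic. The sketch below uses the densities, volumes and nanomagnet counts quoted above, and compares the result with the 6.54 nN actuation force estimated for a type IV panel in the Methods section ‘Hinge spring design and properties’; variable names are illustrative.

```python
g = 9.8  # m/s^2, standard gravity as used in the text

# Silicon nitride: five 10 um x 10 um panels, 50 nm thick
rho_sin = 3170.0                       # kg/m^3 (3.17 g/cm^3)
V_sin = 5 * 10e-6 * 10e-6 * 50e-9      # m^3

# Cobalt nanomagnets: four side panels with 1,040 nanomagnets each
rho_co = 8900.0                        # kg/m^3 (8.9 g/cm^3)
V_magnets = 4 * 1040 * 1.82e-21        # m^3

weight_sin = rho_sin * V_sin * g
weight_magnets = rho_co * V_magnets * g
total_weight = weight_sin + weight_magnets

actuation_force = 6.54e-9              # N, type IV panel at 30 mT
ratio = actuation_force / total_weight

print(f"silicon nitride: {weight_sin:.2e} N")
print(f"nanomagnets:     {weight_magnets:.2e} N")
print(f"total weight:    {total_weight * 1e12:.2f} pN")
print(f"actuation force / weight = {ratio:.0f}")
```

The ratio is a few thousand, consistent with the statement that the actuation force exceeds the gravitational load by about three orders of magnitude.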


Data availability 


All data generated or analysed during this study are included in the 
published article and its Supplementary Information, and are available 
from the corresponding authors on reasonable request. 
Extended Data Fig. 1 | SEM images of a four-panel micromachine. 
a, Overview. b, Enlarged image corresponding to the dashed box in a. Shown 
are arrays of nanomagnets: in the array at top right, the lateral dimension of 
each nanomagnet is 520 nm × 60 nm; at bottom left, the lateral dimension 
of each nanomagnet is 398 nm × 80 nm. Scale bars: a, 4 μm; b, 2 μm. 
Extended Data Fig. 2 | Geometric design and switching behaviour of the 
nanomagnets. a, Schematic of a stadium-shaped nanomagnet with length 
L, width d and thickness t. b, Schematic of the layout of the nanomagnet arrays 
with vertical separation s and horizontal separation d/2. c, Relationship 
between d and L for nanomagnets with the same volume $V_0$ and thickness 
t = 60 nm. Six nanomagnets with different aspect ratios are indicated on the 
curve (the dimensions of each magnet are indicated in nm); the colour of the 
points corresponds to the colour of the hysteresis loops in Fig. 1c. Arrows 
indicate the four types of nanomagnet used in the micromachines (I-IV). 
d, Magneto-optical Kerr effect curves for the six differently sized 
nanomagnets in the field region where they switch (top panel) and the 
derivative with the switching region highlighted with shaded boxes (bottom 
panel). As the six switching regions do not overlap, all six nanomagnets can be 
individually programmed. e, SEM images of fabricated arrays of nanomagnets 
with lateral dimensions given in nanometres, corresponding to the six 
coloured points in c. Scale bar at bottom right (1 μm) applies to all six images. 
[Panel d axes: Kerr rotation (a.u.) and its derivative versus applied magnetic field (mT).] 
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Extended Data Fig. 3| Hinge spring design. a, Schematic of asingle section of controlling field. f, Predicted panel rotating angle versus applied magnetic 
aspring. See Methods for nomenclature. b, SEM image of atwo-panel device field based on theoretical calculations of the two-panel devices with different 
with an 8-turn spring. c, Schematic of two-panel devices with 2-, 4-,6-and numbers of turns. g, Measured panel rotating angle versus applied magnetic 
8-turn spring designs. The turquoise and orange arrows represent the field for the fabricated devices with different numbers of hinge spring turns. 
magnetization direction of the panels. d, Schematic of atwo-panel device that Each data point corresponds to the average of three measurements of the angle 
folds when applying a controlling magnetic field B. See Methods for using image analysis software. Error bars, +1s.d. Scale bars: b, 2 1m; e (applies 
nomenclature. e, Optical microscope images of four fabricated devices with toallimagesine), 51m. 


different numbers of turns inthe spring design on application ofa5 mT 


Extended Data Fig. 4| The 16 magnetization configurations of a four-panel 
micromachine and their corresponding shape transformation after 
applying a vertical controlling field. The four background colours highlight 
the family of four distinct conformations demonstrated in Fig. 1c. 
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3 x 3 assembly Optical microscope image 


Extended Data Fig. 5 | Four conformations of a micromachine consisting of a
3 × 3 assembly of four-panel modular units. Shown in the middle and right
panels are schematics and experimental demonstrations of the actuated
micromachines. The units in a given micromachine all have the same
conformation (left panel) corresponding to one of the 16 different
magnetization configurations. Scale bars in the optical microscope images,
30 μm.

Extended Data Fig. 6 | Four different modes, ‘P’, ‘X’, ‘O’ and ‘9’, encoded in
the same micromachine design. In the conjugate pairs (‘P’ and ‘9’, or ‘X’ and
... nanomagnet encoding, a single micromachine with this design can transform
between these four modes. See main text and Methods for details.

Extended Data Fig. 7 | Possibilities for the total magnetic moments of the
wing tip in the microscale ‘bird’. a, SEM image of the wing tip of the microscale
bird. Turquoise vertical bar, type II nanomagnets (398 nm × 80 nm); blue
horizontal bar, type III nanomagnets (358 nm × 90 nm); purple bar (horizontal
and vertical), type IV nanomagnets (300 nm × 110 nm). Each of the arrays has
the same number of magnets (1,040) with the same magnetic moment m.
b, Nine possible total magnetic moment magnitudes and directions.
c, Schematics of 16 possible magnetic configurations of the wing tip. The
arrays of different types of nanomagnets (types II, III and IV) have different
switching fields, and two of the four arrays have the same
type IV magnets but with orthogonal orientation. Therefore, with the
orientation of the nanomagnets in two of the arrays along the x direction and in
the two other arrays along the y direction, there are in total 2² × 2² = 16 possible
magnetic configurations that can be encoded into the wing tip. d, Schematics
showing the magnitudes and directions of the total magnetic moment of the
wing tip, corresponding to the 16 magnetic configurations shown in c. Scale bar
in a, 4 μm.
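
The counting in panels b–d of Extended Data Fig. 7 can be checked with a short enumeration. The sketch below assumes, following the caption, that each of the four arrays contributes a moment of the same magnitude (taken as 1, in units of the per-array moment), with two arrays magnetized along ±x and two along ±y. The function name `wing_tip_configurations` and the unit normalization are illustrative choices, not part of the original work.

```python
from itertools import product


def wing_tip_configurations():
    """Enumerate the magnetization states of the four wing-tip arrays.

    Assumption: two arrays point along +/-x and two along +/-y, each
    contributing one unit of moment (the per-array moment).
    Returns a list of ((sx1, sx2, sy1, sy2), (Mx, My)) tuples.
    """
    configs = []
    for sx1, sx2, sy1, sy2 in product((+1, -1), repeat=4):
        total = (sx1 + sx2, sy1 + sy2)  # net moment in per-array units
        configs.append(((sx1, sx2, sy1, sy2), total))
    return configs


configs = wing_tip_configurations()
distinct_totals = sorted({total for _, total in configs})
print(len(configs))          # number of magnetic configurations
print(len(distinct_totals))  # number of distinct net moments
```

Under these assumptions the enumeration gives 16 configurations and 9 distinct net moments, because each of the x and y components can only take the values −2, 0 or +2.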
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Extended Data Fig. 8 | Schematic of the steps used to fabricate the 
micromachines. The nanomagnets are fabricated using electron beam
lithography, including patterning of a spin-coated polymer resist, thermal
evaporation of a cobalt thin film and lift off. See Methods section ‘Sample
fabrication’ for more details about the individual steps.


Extended Data Fig. 9 | Optical microscope images of a four-panel
micromachine, demonstrating operation in water and manipulation of
polystyrene microbeads. The micromachine is released in acetone and then a
substitution of water for acetone is performed. The magnetization of all four
panels points towards the centre. a, Micromachine in water without a magnetic
field. b, The micromachine panels fold up in an applied out-of-plane magnetic
field B of 10 mT. c, d, On application of a rotating magnetic field B (10 mT, 5 Hz),
the micromachine rolls across the surface of a silicon wafer, and the rolling
motion generates a vortex in the water surrounding it (highlighted with blue
arrows). Polystyrene microbeads of 6 μm diameter (highlighted with red
arrows) are trapped in the vortex and are transported to a new location. Two
snapshots of the motion, separated by a time interval of 14 seconds, are shown.
Scale bars, 40 μm.

Extended Data Fig. 10 | Optical microscope images of single-panel
micromachines operating in air. Two rows of single-panel micromachines are
shown, each suspended within a D-shaped ‘cutout’ in the silicon nitride
membrane frame, and connected to it on the left side by hinge springs with two
turns. Each panel is 10 μm × 10 μm in size. a, After fabrication, the panels are
somewhat out-of-focus. This is because they are slightly tilted above the plane
of the in-focus silicon nitride frame, which may be due to the residual stress in
the hinge springs. The white arrows indicate the magnetization direction of the
single-panel micromachines pointing left (top row) and right (bottom row).
b, c, On slowly increasing the applied out-of-plane magnetic field, the panels tilt
downwards (top row) or upwards (bottom row), with the tilt angle increasing as
the field magnitude is increased. In the optical images, the panels with
magnetization pointing to the left (top row) first become sharper (b), and they
almost disappear at a tilt angle close to 90° at 35 mT (c). The panels with
magnetization direction pointing to the right (bottom row) tilt upwards in the
applied magnetic field, becoming less visible as the field is increased (b), until
finally disappearing when the tilt angle is close to 90° at 35 mT (c). Scale bar
(for a–c), 20 μm.

Two dry surfaces can instantly adhere upon contact with each other through
intermolecular forces such as hydrogen bonds, electrostatic interactions and van der
Waals interactions. However, such instant adhesion is challenging when wet
surfaces such as body tissues are involved, because water separates the molecules of
the two surfaces, preventing interactions. Although tissue adhesives have potential
advantages over suturing or stapling, existing liquid or hydrogel tissue adhesives
suffer from several limitations: weak bonding, low biological compatibility, poor
mechanical match with tissues, and slow adhesion formation. Here we propose an
alternative tissue adhesive in the form of a dry double-sided tape (DST) made from a
combination of a biopolymer (gelatin or chitosan) and crosslinked poly(acrylic acid)
grafted with N-hydroxysuccinimide ester. The adhesion mechanism of this DST relies on
the removal of interfacial water from the tissue surface, resulting in fast temporary
crosslinking to the surface. Subsequent covalent crosslinking with amine groups on
the tissue surface further improves the adhesion stability and strength of the DST.
In vitro mouse, in vivo rat and ex vivo porcine models show that the DST can achieve
strong adhesion between diverse wet dynamic tissues and engineering solids within
five seconds. The DST may be useful as a tissue adhesive and sealant, and in adhering
wearable and implantable devices to wet tissues.


Existing tissue adhesives, which are in the form of liquids or wet hydrogels,
mostly rely on the diffusion of their molecules (for example,
monomers, macromers or polymers) through the interfacial water
to form bonds with the polymer networks of tissues (Fig. 1a, b). By
contrast, animals capable of forming adhesion in wet environments
commonly possess mechanisms (for example, mussel, barnacle and
spider-web glues) to remove interfacial water from the contact surfaces
in order to form bonds. Inspired by these examples in nature, we
have engineered our DST to adopt a dry-crosslinking mechanism to
remove interfacial water and form adhesion on wet tissues (Fig. 1c, d
and Extended Data Fig. 1).

The DST consists of two major components: first, poly(acrylic acid)
grafted with N-hydroxysuccinimide ester (PAAc-NHS ester) crosslinked
by biodegradable gelatin methacrylate; and second, biodegradable
biopolymers (for example, gelatin or chitosan). The negatively charged
carboxylic acid groups in the PAAc-NHS ester facilitate the quick hydration
and swelling of the DST to dry the wet surfaces of various tissues
under gentle pressure of around 1 kPa, applied for less than 5 seconds.
(See Supplementary Information for a quantitative model showing
how the DST dries interfacial water, and Supplementary Figs. 1-6.)
Simultaneously, the carboxylic acid groups in the PAAc-NHS ester
form intermolecular bonds (for example, hydrogen bonds and electrostatic
interactions) with the tissue surfaces (Fig. 1d and Extended Data
Fig. 1). To provide stable adhesion, the NHS ester groups grafted on the
PAAc also couple covalently with primary amine groups on various
tissues within a few minutes, without the need for further pressure
(Extended Data Figs. 1, 2). After adhering onto tissues, the swollen DST
becomes a thin hydrogel layer with an equilibrium water content of
around 92% by volume (Extended Data Fig. 3). Because the swollen DST
integrates mechanisms for high stretchability and mechanical dissipation,
it exhibits a high fracture toughness of more than 1,000 J m⁻²
(Supplementary Figs. 7, 8), which is crucial in achieving tough adhesion
of the swollen DST.

The dry DST takes the form of a conformable thin film that can be
applied on non-planar tissue surfaces (Extended Data Fig. 4). It can be
fabricated into diverse shapes such as flat sheets, perforated sheets
and adhesive-tape-like rolls (Fig. 1e). In its fully swollen state, the DST
exhibits a shear modulus of 2.5–5 kPa and can stretch to more
than 16 times its original length, mechanically matching soft
tissues (Fig. 1f, g). To remove potentially cytotoxic residual reagents,
we thoroughly purified the DST during its preparation (Extended Data
Fig. 5). The in vitro biocompatibility of the DST-conditioned medium
is comparable to that of the control medium, showing no observable
decrease in the in vitro viability of mouse embryonic fibroblasts
(MEFs) after 24-h culture (Fig. 1h). The crosslinkers (that is, gelatin
methacrylate) for PAAc-NHS ester and the biopolymers (that is, gelatin
or chitosan) in the DST are biodegradable by endogenous enzymes
(for example, collagenase, lysozyme or N-acetyl-β-D-glucosaminidase
(NAGase)) at varying rates. For example, gelatin typically degrades
more quickly than chitosan under physiological conditions. Hence,
the in vitro biodegradation rate of the DST can be controlled over time
periods from a week (for the gelatin-based DST) to several months
(for the chitosan-based DST) by tuning its composition (Fig. 1i and
Supplementary Fig. 9).

Fig. 1 | Dry DST and dry-crosslinking mechanism for adhesion of wet tissues
and devices. a, Existing tissue adhesives take the form of liquids or wet
hydrogels. b, Adhesion formation by these existing adhesives mostly relies on
the diffusion of monomers, macromers or polymers towards the tissues. c, Our
proposed tissue adhesive takes the form of a dry DST. d, The dry-crosslinking
mechanism for the DST integrates the drying of interfacial water by hydration
and swelling of the dry DST, temporary crosslinking, and covalent crosslinking
(the latter involving the formation of covalent bonds between the DST and
amine groups on tissues). e, The DST can take on various shapes owing to its
high flexibility in fabrication. The DST is coloured with a red food dye for
visualization. f, g, Photographs (f) and nominal stress versus stretch curve (g)
for the DST in its swollen state, stretched to more than 16 times the original
length. The DST is coloured with a red food dye for visualization. λ, stretch;
F, force. h, In vitro biocompatibility of the DST in a live/dead assay of mouse
embryonic fibroblasts (MEFs) after 24 hours of culture. i, In vitro
biodegradation of the gelatin-based DST in Dulbecco’s phosphate-buffered
saline (DPBS) with or without collagenase. Values in panels h, i represent the
mean and the standard deviation (n = 3-5).

To evaluate the adhesion performance of the DST, we conduct three
different types of mechanical tests, measuring the interfacial toughness
by peel tests, the shear strength by lap-shear tests and the tensile
strength by tensile tests (according to the following testing standards
for tissue adhesives: ASTM F2256 for 180-degree peel tests, ASTM F2255
for lap-shear tests, and ASTM F2258 for tensile tests; Extended Data
Fig. 6 and Supplementary Fig. 10). We first choose wet porcine skin
as the model tissue for evaluation of adhesion performance, owing
to its mechanical robustness and close resemblance to human skin.
The DST can establish tough (with an interfacial toughness of more
than 710 J m⁻²) and strong (with a shear and tensile strength of more
than 120 kPa) adhesion between wet porcine skins upon contact and
application of gentle pressure (1 kPa) for less than 5 seconds (Fig. 2a,
Supplementary Fig. 11 and Supplementary Video 1). The tissues adhered
by the DST exhibit a relatively small decrease (of less than 10%) in the
measured interfacial toughness and strength more than 48 h after the
initial pressing (Fig. 2b). Furthermore, the DST can maintain its ability
to form robust adhesion on wet tissues after being stored for more than
2 weeks (Supplementary Fig. 12).

Fig. 2 | Adhesion performance of the DST. a, Interfacial toughness and shear
and tensile strength versus pressing time for wet porcine skins adhered using
the DST with NHS ester. b, Interfacial toughness and shear and tensile strength
versus time after pressing for wet porcine skins adhered using the DST with
NHS ester. c, Comparison of adhesion performances of the DST and various
commercially available tissue adhesives. PSA, pressure-sensitive adhesives.
d, Interfacial toughness and shear and tensile strength between various tissues
adhered by the DST. e, Interfacial toughness and shear and tensile strength
between porcine skin and various engineering solids adhered by the DST.
Values represent the mean and standard deviation (n = 3-5).

We also examine the importance of covalent crosslinking after
intermolecular crosslinking on the adhesion performance of the DST.
We test the adhesion performance of the DST without grafted NHS
ester on the PAAc, which cannot form covalent crosslinks with tissues
(Extended Data Fig. 7). Although the DST without NHS ester can provide
tough (with an interfacial toughness greater than 500 J m⁻²) and strong
(with a shear and tensile strength more than 80 kPa) adhesion upon
application between wet porcine skins (Extended Data Fig. 7a), the
adhesion performance shows substantial deterioration over time
(Extended Data Fig. 7b), owing to the unstable and temporary nature
of the intermolecular bonds in wet environments. Hence, both the
temporary crosslinks and the subsequent covalent crosslinks are
necessary for stable and robust adhesion of wet surfaces, supporting
our proposed mechanism (Fig. 1d and Extended Data Fig. 1).

We further test the adhesion performance of the DST under cyclic
loading conditions. Two porcine heart tissues adhered by the DST
maintain a high interfacial toughness of more than 650 J m⁻² during
cyclic loading over 5,000 cycles with physiologically relevant strain
(30% tensile strain) (Supplementary Fig. 13). In addition, the DST can
provide similarly high interfacial toughness (of more than 640 J m⁻²)
and shear and tensile strength (of more than 85 kPa) on blood-covered
porcine tissues after washout with saline (Supplementary Fig. 14).

Fig. 3 | In vivo adhesion, biocompatibility, and biodegradability of the DST.
a, Adhesion of a DST-TPU hybrid patch on a beating rat heart in vivo. b, The
DST-TPU patch adhered on the rat heart 3 days after implantation in vivo.
c, A schematic illustration of the section taken for histology (dotted yellow line)
through a DST-TPU or a sutured TPU patch implanted on the rat epicardial
surface. d, e, Representative histological images of the DST-TPU patch (d) and
the sutured TPU patch (e) stained with haematoxylin and eosin (H&E).
f-h, Representative histological images of the chitosan-based DST (f), the
gelatin-based DST (g) and the Coseal (h), stained with H&E. i, j, Representative
histological images stained with H&E for assessment of the biodegradation of
subcutaneously implanted chitosan-based DST (i) and gelatin-based DST (j) after
2 weeks. k, l, Representative histological images stained with H&E for assessment
of the biocompatibility and biodegradation of subcutaneously implanted
chitosan-based DST (k) and gelatin-based DST (l) after 4 weeks. SM, GT and FC
indicate skeletal muscle, granulation tissue and fibrous capsule, respectively.
All experiments were repeated three or four times with similar results.

The DST demonstrates superior adhesion performance compared
with existing tissue adhesives, including commercially available
cyanoacrylate adhesives (Histoacryl and Dermabond), albumin-based
adhesives (BioGlue), polyethylene-glycol-based adhesives (Coseal and
DuraSeal), fibrin glues (Tisseel), hydrophilic pressure-sensitive adhesives
(Tegaderm hydrocolloid), as well as nanoparticle solutions and
ultraviolet-curable surgical glues (Fig. 2c and Extended Data Fig. 8).
We find that these existing tissue adhesives require a relatively long
time to form adhesion (longer than 1 min; Extended Data Fig. 8) and
exhibit limited adhesion performance on wet tissues (with an interfacial
toughness of less than 20 J m⁻² and a shear and tensile strength
of less than 45 kPa; Fig. 2c), consistent with published performances.
The DST provides a higher interfacial toughness (up to 1,150 J m⁻²) and
shear and tensile strength (up to 160 kPa) than existing tissue adhesives,
and forms the adhesion in less than 5 seconds (Fig. 2c–e and Extended
Data Fig. 8). Although tough hydrogel adhesives can achieve a similarly
high interfacial toughness of more than 1,000 J m⁻² on wet tissues, they
require steady pressure application for substantially longer periods of
time (5–30 min) on tissue surfaces to form the adhesion.

The DST can be applied to various wet tissues, including skin, small
intestine, stomach, muscle, heart and liver (Fig. 2d and Supplementary
Video 2), with high interfacial toughness (more than 710 J m⁻² for
skin, 580 J m⁻² for small intestine, 450 J m⁻² for stomach, 570 J m⁻² for
muscle, 340 J m⁻² for heart and 190 J m⁻² for liver) and high shear and
tensile strength (more than 120 kPa for skin, 80 kPa for small intestine,
70 kPa for stomach, 80 kPa for muscle, 70 kPa for heart and 20 kPa for
liver) (Fig. 2d and Supplementary Fig. 10a-c). The DST can also provide
adhesion between wet tissues and various engineering solids, including
hydrogel, silicon, titanium, polydimethylsiloxane (PDMS), polyimide and
polycarbonate (Fig. 2e and Supplementary Video 3). We functionalize
the surfaces of various engineering solids with primary amines in order
to ensure covalent coupling with the DST (Extended Data Fig. 9), and
then evaluate the adhesion performance using wet porcine skin
(Supplementary Fig. 10d-f). The adhesion between the wet tissues and
various engineering solids by the DST exhibits high interfacial toughness
(higher than 1,150 J m⁻² for hydrogel, 800 J m⁻² for silicon, 680 J m⁻² for
titanium, 480 J m⁻² for PDMS, 720 J m⁻² for polyimide and 410 J m⁻² for
polycarbonate) and high shear and tensile strength (more than 80 kPa
for hydrogel, 160 kPa for silicon, 150 kPa for titanium, 100 kPa for PDMS,
100 kPa for polyimide and 70 kPa for polycarbonate) (Fig. 2e).

Fig. 4 | Potential applications of the DST. a, Sealing of an air-leaking lacerated
ex vivo porcine lung lobe by a hydrogel patch adhered with the DST. b, Sealing
of a fluid-leaking ex vivo porcine stomach by a hydrogel patch adhered with the
DST. c, DST-mediated adhesion of a drug-loaded patch on a beating ex vivo
porcine heart with a cut. d, Diffusion of a mock drug (fluorescein) from a
DST-adhered drug patch into the ex vivo porcine heart tissue over time. e, Adhesion
of a DST-strain-sensor hybrid on a beating ex vivo porcine heart. f, Normalized
electrical resistance (R/R₀) of the DST-adhered strain sensor over time, in order
to measure deformation of the beating heart. The blue shades in the graph
indicate the intervals during which pressure inputs are introduced to the ex
vivo porcine heart to mimic beating.

In order to evaluate the ability of the DST to adhere to wet and
dynamic surfaces in vivo, we adhere a thermoplastic polyurethane
(TPU) patch to the epicardial surface of a rat heart using the DST
(Fig. 3a, b and Supplementary Fig. 15a). We find that a 5-mm-diameter
DST-TPU hybrid patch (using the gelatin-based DST with a dry thickness
of 20 μm) can be adhered to the epicardial surface of a beating
rat heart after gently pressing for 5 seconds (Fig. 3a and Supplementary
Video 4). After 3 days of in vivo implantation, the DST-TPU patch
maintains adhesion to the rat heart surface (Fig. 3b) while producing
a host response similar to that of reported epicardial patches (Fig. 3c,
d). Histological assessment by a blinded pathologist indicates that
the degree of inflammatory reaction induced by the DST-TPU patch
is comparable to that of a sutured TPU patch (Fig. 3c-e).

We further evaluate the in vivo biocompatibility and biodegradability
of the DST in a rat model of dorsal subcutaneous implantation
(Fig. 3f-h and Supplementary Fig. 15b). Histological assessment
demonstrates that, after 2 weeks of implantation, the chitosan-based DST
(20-μm dry thickness) generates a comparable inflammatory reaction
(Fig. 3f) to that produced by Coseal, a US Food and Drug Administration
(FDA)-approved, commercially available tissue adhesive (Fig. 3h). The
histology at 2 weeks for both implant types is characterized by a mild
to moderate chronic inflammatory response involving macrophages,
lymphocytes and occasional giant cells in association with the formation
of a capsule of granulation tissue comprising fibroblasts, collagen
and new blood vessels. There is no evidence of necrosis of the overlying
skeletal muscle or skin, or of an eosinophilic response suggestive of an
allergic reaction. Although the gelatin-based DST shows a higher degree
of inflammatory response than the chitosan-based DST, as indicated by
a denser chronic inflammatory reaction (Fig. 3g), no major damage to
the surrounding dermal and muscular layers is observed after 2 weeks
of implantation. The more pronounced inflammatory response of
the gelatin-based DST might result from the faster biodegradation of
gelatin than of chitosan and subsequent effects on the surrounding
tissues, including a higher degree of phagocytotic responses.

Furthermore, the histological images of the subcutaneously
implanted DST demonstrate in vivo biodegradability of the DST
(Fig. 3i-l). After two weeks of implantation, the relatively slow-degrading
chitosan-based DST maintains an intact film-like configuration
(Fig. 3i), while the relatively fast-degrading gelatin-based DST shows
signs of degradation such as reduction in thickness and a fragmented
configuration (Fig. 3j). At an implantation period of 4 weeks, the
chitosan-based DST shows limited signs of degradation (Fig. 3k), whereas
there is substantial continued degradation of the gelatin-based DST,
as shown by increased material resorption by macrophages (Fig. 3l). In
addition, there is appropriate evolution and attenuation of the
inflammatory response generated by both the gelatin-based and the
chitosan-based DST after 4 weeks of implantation, including a decrease in the
magnitude of the chronic inflammatory infiltrate and thinning of the
surrounding fibrous capsule.

In order to demonstrate potential applications of the DST, we investigate
a range of proof-of-principle applications using ex vivo porcine
models. The DST combined with a degradable tough hydrogel patch
can form an air-tight sealing of a lacerated, air-leaking lung lobe and
trachea (Fig. 4a, Extended Data Fig. 10a and Supplementary Video 5).
Furthermore, it provides a fluid-tight sealing of a fluid-filled perforated
stomach (with a 1-cm-wide hole) and a dissected small intestine (Fig. 4b,
Extended Data Fig. 10b and Supplementary Videos 6, 7). We further show
that the DST can be used to adhere devices onto dynamic and deformable
tissues. For example, we use the DST to adhere a hydrogel
patch with a mock drug (fluorescein) onto a beating ex vivo porcine
heart (introducing cyclical, pressurized air inputs to mimic heart beats).
This suggests that the DST might allow the attachment of drug-delivery
devices onto dynamic wet tissues (Fig. 4c and Supplementary Video 8).
The adhered DST patch maintains adhesion on the beating heart for
more than 12 h without any sign of decreased adhesion and allows delivery
of a mock drug into the heart tissue (Fig. 4d). As another example, we
adhere a stretchable strain sensor on the beating porcine heart (Fig. 4e
and Supplementary Video 9). The DST allows facile attachment of the
strain sensor on the dynamic and curved surface of the beating heart,
as well as electrical measurements of the heart movements (Fig. 4f).
Notably, the stretchable DST-sensor hybrid is prepared by printing a
conductive ink on a DST-Ecoflex hybrid substrate (Supplementary
Fig. 16). Such DST-device hybrids could serve as a versatile platform
for wearable and implantable devices to adhere onto wet and dynamic
tissues. Although these ex vivo models show possible applications of
the DST, we note that its long-term efficacy, biocompatibility and
biodegradability, as well as the induced biological responses (for example,
healing) in clinically relevant settings, require further studies.

In this study, we have reported a biologically inspired, dry-crosslinking
mechanism, implemented in the form of a dry DST, for the
adhesion of wet tissues and devices. This DST offers advantages over
existing tissue adhesives and sealants, including fast adhesion formation,
robust adhesion performance, flexibility, and ease of storage and
use. The DST may also provide new opportunities for bioscaffolds, drug
delivery, and wearable and implantable devices. This dry-crosslinking
mechanism for the adhesion of wet surfaces could also be applied in
the design of adhesives for wet and underwater environments.

174 | Nature | Vol 575 | 7 November 2019

Methods


Materials 

All chemicals were obtained from Sigma-Aldrich unless otherwise mentioned,
and used without further purification. To prepare the DST, we
used acrylic acid, gelatin methacrylate (gelMA; type A bloom 90-100
from porcine skin with 60% substitution), acrylic acid N-hydroxysuccinimide
ester (AAc-NHS ester), α-ketoglutaric acid, gelatin (type A
bloom 300 from porcine skin) and chitosan (75-85% deacetylated). To
visualize the DST in photographs and microscope images, we used red
food dye (McCormick) and fluorescein isothiocyanate (FITC)-gelatin
(Thermo Fisher Scientific). For in vitro biodegradation tests, we used
Dulbecco’s phosphate-buffered saline (DPBS, with calcium and magnesium;
Gibco), collagenase, lysozyme and NAGase. To prepare the
degradable tough hydrogel, we used acrylamide, gelatin, gelMA, and
Irgacure 2959. For surface functionalization of engineering solids, we
used (3-aminopropyl)triethoxysilane (APTES) and hexamethylenediamine
(HMDA). To prepare the stretchable strain sensor, we used Ecoflex
00-30 (Smooth-On), silicone curing retardant (SLO-JO, Smooth-On)
and carbon black (Alfa Aesar). All engineering solids were obtained
from McMaster-Carr unless otherwise mentioned. All porcine tissues
for ex vivo experiments were purchased from a research-grade porcine
tissue vendor (Sierra Medical).


Preparation of the dry DST 

The DST was prepared with either gelatin or chitosan. To prepare the
gelatin-based DST, we dissolved 30% (w/w) acrylic acid, 10% (w/w) gelatin,
1% (w/w) AAc-NHS ester, 0.1% (w/w) gelMA and 0.2% (w/w) α-ketoglutaric
acid in deionized water. The mixture was then filtered with 0.4-μm sterile
syringe filters and poured on a glass mould with spacers. The DST
was cured in an ultraviolet (UV) light chamber (284 nm, 10 W power)
for 20 min and completely dried. The final DST was sealed in plastic
bags with desiccant (silica gel packets) and stored at −20 °C before use.
The chitosan-based DST was prepared by replacing 10% (w/w) gelatin
with 2% (w/w) chitosan. In experiments, we used the gelatin-based DST
with an as-prepared thickness of 210 μm unless otherwise mentioned.
To prepare the DST in various shapes, we cut a large sheet of DST into
each design using a laser cutter (Epilog). Polyethylene-coated paper
was used as a backing for the DST. To aid visualization of the DST, we
added 0.5% (w/w) of red food dye (for photographs) or 0.2% (w/w)
FITC-gelatin (for fluorescence microscopy images) into the precursor
solution before curing.
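
As a worked example of the recipe above, the following sketch converts the stated weight fractions into component masses for one batch of precursor solution. The 50 g batch size, the function name `batch_masses` and the treatment of deionized water as the balance of the mixture are assumptions made for illustration; the text gives only the weight fractions.

```python
# Weight fractions (w/w) of the gelatin-based DST precursor, as listed above.
GELATIN_DST = {
    "acrylic acid": 0.30,
    "gelatin": 0.10,
    "AAc-NHS ester": 0.01,
    "gelMA": 0.001,
    "alpha-ketoglutaric acid": 0.002,
}


def batch_masses(fractions, total_mass_g):
    """Return the mass in grams of each component, with water as the balance."""
    solute_fraction = sum(fractions.values())
    if not 0 < solute_fraction < 1:
        raise ValueError("solute fractions must sum to a value between 0 and 1")
    masses = {name: frac * total_mass_g for name, frac in fractions.items()}
    # Assumption: everything not listed is deionized water.
    masses["deionized water"] = (1 - solute_fraction) * total_mass_g
    return masses


# Chitosan-based variant: the 10% gelatin is replaced with 2% chitosan.
CHITOSAN_DST = {k: v for k, v in GELATIN_DST.items() if k != "gelatin"}
CHITOSAN_DST["chitosan"] = 0.02

for name, mass in batch_masses(GELATIN_DST, 50.0).items():
    print(f"{name}: {mass:.2f} g")
```

The same function applies to the chitosan-based formulation by passing `CHITOSAN_DST` instead.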


Mechanical tests 

Tissue samples stored more than 10 min before mechanical tests were 
covered with an excess of 0.01% (w/v) sodium azide solution (in PBS) 
spray and sealed in plastic bags to prevent degradation and dehydra- 
tion. Unless otherwise indicated, all tissues and engineering solids 
were adhered by the DST after washing out the surfaces with PBS fol- 
lowed by 5 seconds of pressing (with 1 kPa pressure applied by either a 
mechanical testing machine or an equivalent weight). Unless otherwise 
indicated, all mechanical tests on adhesion samples were performed 
24 h after initial pressing to ensure equilibrium swelling of the adhered
DST in wet environments. The application of commercially available 
tissue adhesives followed the manual provided for each product. The 
gelatin-based DST was used unless otherwise noted. 
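
The "equivalent weight" for the 1 kPa pressing step follows from m = P·A/g. The sketch below evaluates it for the two adhesion areas given later in this section (2.5 cm × 1 cm for lap-shear tests and 2.5 cm × 2.5 cm for tensile tests). The use of standard gravity and the function name `pressing_mass_g` are assumptions; the text does not state how the weight was chosen.

```python
STANDARD_GRAVITY = 9.80665  # m s^-2 (assumed; not stated in the text)


def pressing_mass_g(pressure_kpa, width_cm, length_cm):
    """Mass in grams whose weight applies the given pressure over a rectangle."""
    area_m2 = (width_cm / 100.0) * (length_cm / 100.0)
    force_n = pressure_kpa * 1000.0 * area_m2  # F = P * A
    return force_n / STANDARD_GRAVITY * 1000.0  # kg -> g


# 1 kPa over the lap-shear and tensile adhesion areas.
print(round(pressing_mass_g(1.0, 2.5, 1.0), 1))
print(round(pressing_mass_g(1.0, 2.5, 2.5), 1))
```

Under these assumptions the required masses are on the order of tens of grams, which is consistent with the description of the pressure as gentle.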

To measure interfacial toughness, adhered samples with widths
of 2.5 cm were prepared and tested by the standard 180-degree peel
test (ASTM F2256) or 90-degree peel test (ASTM D2861) (for inflexible
substrates such as silicon) using a mechanical testing machine
(2.5 kN load-cell, Zwick/Roell Z2.5). All tests were conducted with a
constant peeling speed of 50 mm min⁻¹. The measured force reached
a plateau as the peeling process entered the steady state. Interfacial
toughness was determined by dividing two times the plateau force (for
a 180-degree peel test) or the plateau force (for a 90-degree peel test)
by the width of the tissue sample (Extended Data Fig. 6a). Poly(methyl
methacrylate) films (with a thickness of 50 μm; Goodfellow) were
applied using cyanoacrylate glue (Krazy Glue) as a stiff backing for
the tissues and hydrogels.
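
The peel-test calculation described above can be written out as a short function. This is a minimal sketch: the function name `interfacial_toughness` and the 9 N plateau force are hypothetical and serve only to illustrate the arithmetic, with force in newtons and width in metres so that the result is in J m⁻².

```python
def interfacial_toughness(plateau_force_n, width_m, peel_angle_deg=180):
    """Interfacial toughness (J m^-2) from a steady-state peel force.

    180-degree peel: 2 * F / w; 90-degree peel: F / w.
    """
    if peel_angle_deg == 180:
        factor = 2.0
    elif peel_angle_deg == 90:
        factor = 1.0
    else:
        raise ValueError("only 90- and 180-degree peel tests are covered")
    return factor * plateau_force_n / width_m


# Hypothetical plateau force of 9 N on a 2.5-cm-wide sample.
print(interfacial_toughness(9.0, 0.025))      # 180-degree peel
print(interfacial_toughness(9.0, 0.025, 90))  # 90-degree peel
```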

To measure shear strength, adhered samples with an adhesion area of
width 2.5 cm and length 1 cm were prepared and tested by the standard
lap-shear test (ASTM F2255) with a mechanical testing machine (2.5 kN
load-cell, Zwick/Roell Z2.5) (Extended Data Fig. 6b). All tests were
conducted with a constant tensile speed of 50 mm min⁻¹. Shear strength
was determined by dividing the maximum force by the adhesion area.
Poly(methyl methacrylate) films were applied using cyanoacrylate glue
to act as a stiff backing for the tissues and hydrogels.

To measure tensile strength, adhered samples with adhesion areas
of width 2.5 cm and length 2.5 cm were prepared and tested by the
standard tensile test (ASTM F2258) with a mechanical testing machine
(2.5 kN load-cell, Zwick/Roell Z2.5) (Extended Data Fig. 6c). All tests
were conducted with a constant tensile speed of 50 mm min⁻¹. Tensile
strength was determined by dividing the maximum force by the adhesion
area. Aluminium fixtures were applied using cyanoacrylate glues
to provide grips for tensile tests.
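
Shear and tensile strength are both obtained by dividing the maximum force by the adhesion area, so a single helper covers the two tests. In the sketch below the function name `adhesion_strength_kpa` and the peak forces are hypothetical values chosen for illustration; only the sample dimensions come from the text.

```python
def adhesion_strength_kpa(max_force_n, width_cm, length_cm):
    """Strength in kPa: maximum force divided by the adhesion area."""
    area_m2 = (width_cm / 100.0) * (length_cm / 100.0)
    return max_force_n / area_m2 / 1000.0  # Pa -> kPa


# Hypothetical peak forces on the lap-shear and tensile adhesion areas.
shear = adhesion_strength_kpa(30.0, 2.5, 1.0)    # 2.5 cm x 1 cm
tensile = adhesion_strength_kpa(75.0, 2.5, 2.5)  # 2.5 cm x 2.5 cm
print(f"shear: {shear:.0f} kPa, tensile: {tensile:.0f} kPa")
```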

To characterize mechanical properties of the swollen DST, the DST
was equilibrated in PBS before tests. The tensile properties and fracture
toughness of the DST were measured using pure-shear tensile tests of
thin rectangular samples (10 mm in length, 30 mm in width and 0.5 mm
in thickness) with a mechanical testing machine (20 N load-cell, Zwick/
Roell Z2.5). All tests were conducted with a constant tensile speed of
50 mm min⁻¹. The fracture toughness of the DST was calculated using
a reported method based on tensile tests of unnotched and notched
samples.

To characterize the adhesion performance of the DST under cyclic 
loading, two porcine heart tissues were adhered by the DST with an 
adhesion area of width 2.5 cm and length 4 cm. Each side of the adhered 
tissues was cyclically stretched at 30% tensile strain (with respect to 
the DST length) using a mechanical testing machine (2.5 kN load-cell, 
Zwick/Roell Z2.5) to provide cyclic shear loading to the adhesion inter- 
face (Supplementary Fig. 13). Interfacial toughness between the heart 
tissues adhered by the DST was measured at different cycle numbers 
by the standard 180-degree peel test (ASTM F2256). During the cyclic 
tests, a 0.01% (w/v) sodium azide solution (in PBS) was sprayed onto 
the heart tissues to avoid tissue degradation and dehydration. 


Preparation of engineering solids 

To prepare degradable tough hydrogels for adhesion tests of engineer- 
ing solids, we dissolved 20% (w/w) acrylamide, 10% (w/w) gelatin, 0.2% 
(w/w) gelMA and 0.2% (w/w) Irgacure 2959 in deionized water. The 
mixture was then filtered with 0.4-μm sterile syringe filters and poured 
on a glass mould with spacers. The hydrogels were cured in a UV chamber 
(284 nm, 10 W power) for 60 min. To facilitate covalent coupling 
with the DST, engineering solids except hydrogel were functionalized 
with primary amines (Extended Data Fig. 9). For silicon, titanium and 
PDMS, the substrates were first treated with oxygen plasma for 2 min 
(30 W power, Harrick Plasma) to activate the surface. Subsequently, the 
plasma-treated substrates were covered with APTES solution (1% (w/w) 
APTES in 50% ethanol) and incubated for 3 h at room temperature. 
The substrates were then thoroughly washed with isopropyl alcohol 
and dried using a nitrogen flow. For polyimide and polycarbonate, the 
substrates were immersed into the HMDA solution (10% (v/v) in deionized 
water) for 24 h at room temperature. The substrates were then thoroughly 
washed with deionized water and dried using a nitrogen flow. 


In vitro biocompatibility tests 

We conducted in vitro biocompatibility tests using DST-conditioned 
medium for cell culture. To prepare the DST-conditioned medium for 
in vitro biocompatibility tests, we incubated 20 mg of the gelatin-based 
DST in 1 ml of Dulbecco’s modified Eagle medium (DMEM) at 37 °C 
for 24 h. Pristine DMEM was used as a control. Wild-type MEFs were 
plated in 96-well plates (n = 10 for DST-conditioned medium; n = 10 for 
DMEM). The cells were then treated with the DST-conditioned medium 
and incubated at 37 °C for 24 h in 5% CO₂. Cell viability was determined 
with a live/dead viability/cytotoxicity kit for mammalian cells (Thermo 
Fisher Scientific) by adding 4 μM calcein and ethidium homodimer-1 
into the culture medium. We used a confocal microscope (SP 8, Leica) 
to image live cells with excitation/emission at 495 nm/515 nm, and dead 
cells at 495 nm/635 nm. 


In vitro biodegradation tests 

We carried out in vitro biodegradation tests of the DST using enzymatic 
degradation media as described. To prepare in vitro enzymatic 
biodegradation medium for the gelatin-based DST, we added 5 mg 
collagenase in 100 ml DPBS. To prepare in vitro enzymatic biodegradation 
medium for the chitosan-based DST, we added 5 mg collagenase, 5 mg 
lysozyme and 10 μl of 1 mg ml⁻¹ NAGase aqueous solution in 100 ml 
DPBS. The DST was cut into small samples (of width 10 mm and length 
10 mm) and accurately weighed. Before immersion in the enzymatic 
media, the samples were sterilized in 75% ethanol for 15 min and washed 
three times with DPBS. Each sample was then immersed in 15 ml of the 
enzymatic medium within glass scintillation vials and incubated at 37 °C 
with shaking at 60 r.p.m. About 0.01% (w/v) sodium azide was added 
into the enzymatic media to prevent growth of any microorganisms 
during the tests. At each time interval, the DST was removed from the 
incubation medium, exhaustively washed with deionized water and 
lyophilized. Weight loss was determined as the percentage ratio of 
the mass of the lyophilized sample at each time interval, normalized 
by the dry mass of the original lyophilized sample. 
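
The mass normalization described above can be sketched as follows. The text defines the measure as the percentage ratio of the lyophilized mass at each time point to the original dry mass; the sketch reports that ratio and, as an assumption about how "weight loss" is meant, its complement to 100%. The function names and the masses in the example are hypothetical.

```python
def mass_remaining_percent(dry_mass_t_mg, dry_mass_0_mg):
    """Lyophilized mass at time t as a percentage of the original dry mass."""
    if dry_mass_0_mg <= 0:
        raise ValueError("original dry mass must be positive")
    return 100.0 * dry_mass_t_mg / dry_mass_0_mg


def weight_loss_percent(dry_mass_t_mg, dry_mass_0_mg):
    """Assumed reading of 'weight loss': 100% minus the remaining fraction."""
    return 100.0 - mass_remaining_percent(dry_mass_t_mg, dry_mass_0_mg)


# Hypothetical masses (mg) for one sample at three time points.
initial_mg = 20.0
for label, mass_mg in [("t1", 20.0), ("t2", 15.0), ("t3", 5.0)]:
    remaining = mass_remaining_percent(mass_mg, initial_mg)
    loss = weight_loss_percent(mass_mg, initial_mg)
    print(f"{label}: {remaining:.1f}% remaining, {loss:.1f}% lost")
```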


In vivo adhesion, biocompatibility and biodegradability tests 

All animal procedures were reviewed and approved by the Massachu- 
setts Institute of Technology Committee on Animal Care. Details of 
surgical procedures and in vivo data analysis are provided in the Sup- 
plementary Information. 


Preparation of the DST-strain-sensor hybrid 

We prepared the DST-strain-sensor hybrid by printing a conductive ink 
onto a DST-elastomer hybrid substrate. This elastomer substrate was 
first prepared by casting Ecoflex 00-30 resin mixture (part A and part 
B in a 1/1 volume ratio) into a laser-cut acrylic mould. Subsequently, a 
thin layer of gelatin-based DST (100-μm dry thickness) was introduced 
on the bottom side of the Ecoflex substrate according to the reported 
protocol for hydrogel-elastomer hybrids. The strain sensor was 
fabricated by printing the conductive ink onto the DST-Ecoflex hybrid 
substrate using a custom direct ink writing (DIW) 3D printer. Briefly, 
the conductive ink was prepared by mixing 10% (w/w) carbon black 
and 1% (w/w) silicone curing retardant into Ecoflex 00-30 resin (part 
A and part B in a 1/1 volume ratio) using a planetary mixer (AR-100, 
Thinky). The printing paths were generated through production of 
G-codes that control the XYZ motions of a robotic gantry (Aerotech). We 
used a pressure-based microdispenser (Ultimus V, Nordson EFD) with 
a 200-μm-diameter nozzle (Smoothflow tapered tip, Nordson EFD) to 
print the conductive ink on the substrate through a custom LabVIEW 
interface (National Instruments). Deformation-induced changes in the 
electrical resistance of the strain sensor were monitored with a digital 
multimeter (34450A, Keysight). 


HPLC characterization of the DST 

We analysed the residual monomer contents of the DST using analytical 
high-performance liquid chromatography (HPLC; Model 1100, Agilent). 
We used 0.1% phosphoric acid as the mobile phase, extractant and 
medium for an acrylic acid monomer standard solution as described 
(Extended Data Fig. 5). To extract the residual monomer from the DST, 
we incubated 100 mg of the DST in 20 ml of the extractant for 24 h with 
stirring. After the extraction, the solution was filtered with a sterile 
0.2-μm syringe filter and injected into the HPLC system for analysis. 
The concentration of the residual acrylic acid monomer in the DST 
was determined on the basis of the calibration curve obtained from 
the standard solution diluted with the mobile phase to varying mono- 
mer concentrations. 
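
A calibration-curve quantification of this kind can be sketched as a linear fit followed by a conversion to monomer mass per mass of DST. The sketch below assumes a linear detector response, complete extraction and no dilution of the extract; the standard concentrations, peak areas and function names are hypothetical. Only the extraction conditions (100 mg of DST in 20 ml of extractant) come from the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_monomer_ng_per_mg(peak_area, slope, intercept,
                               extract_volume_ml=20.0, sample_mass_mg=100.0):
    """Residual monomer (ng per mg of sample) from an HPLC peak area."""
    conc_ug_per_ml = (peak_area - intercept) / slope
    total_ng = conc_ug_per_ml * extract_volume_ml * 1000.0
    return total_ng / sample_mass_mg


# Hypothetical standards: concentration (ug/ml) versus peak area (a.u.).
standards_ug_per_ml = [0.1, 0.5, 1.0, 2.0]
standard_areas = [10.0, 50.0, 100.0, 200.0]
slope, intercept = fit_line(standards_ug_per_ml, standard_areas)

# A hypothetical sample peak area of 37 a.u. maps to 0.37 ug/ml in the extract.
content = residual_monomer_ng_per_mg(37.0, slope, intercept)
print(f"residual monomer: {content:.1f} ng per mg")
```

Under the same assumptions, the value reported in Extended Data Fig. 5 (74 ng per 1 mg of the DST) corresponds to an extract concentration of 0.37 μg ml⁻¹, which is why that concentration is used in the example.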


FTIR characterization of the DST 

The chemical composition of the DST was characterized using a 
transmission Fourier transform infrared spectroscope (FTIR 6700, Thermo 
Fisher) with a germanium-attenuated total reflectance (ATR) crystal 
(55 degrees). The FTIR spectrum of the DST was analysed as described 
(Extended Data Fig. 2a). 


Ex vivo tests 

All ex vivo experiments were reviewed and approved by the Committee 
on Animal Care at the Massachusetts Institute of Technology. To assess 
sealing of damaged trachea, we made a laceration (1.5 cm in length) in 
a porcine trachea using a razor blade. Air was then applied through 
tubing connected to the upper part of the trachea (25 mm Hg pressure) 
to visualize air leakage from the trachea submerged in a water bath. To 
seal the laceration, we adhered a hydrogel patch (of width 2.5 cm and 
length 5 cm) to the damaged trachea using the DST with 5 seconds of 
pressing. The sealed porcine trachea was kept for 12 h at room 
temperature with continuous inflation–deflation cycles to monitor the 
DST-based sealing. We added 0.01% (w/v) sodium azide into the water 
bath to avoid tissue degradation. 

To assess sealing of a damaged lung lobe, we made a laceration (3 cm 
long) in a porcine lung lobe with a razor blade. Air was then applied 
through tubing connected to the upper part of the trachea (25 mm Hg 
pressure) in order to visualize air leakage from the lung lobe submerged 
in the water bath. To seal the laceration, we adhered a hydrogel patch 
(of width 2.5 cm and length 5 cm) to the damaged lung lobe using the 
DST with 5 seconds of pressing. The sealed porcine lung lobe was kept 
for 12 h at room temperature with continuous inflation–deflation cycles 
to monitor the DST-based sealing. We added 0.01% (w/v) sodium azide 
into the water bath to avoid tissue degradation. 

To assess sealing of damaged stomach, we punched a 10-mm- 
wide hole in a porcine stomach. A tube with flowing water was then 
connected to the upper part of the stomach to visualize fluid leakage 
from the stomach. To seal the hole, we adhered a 40-mm-wide hydrogel 
patch onto the damaged stomach using the DST with 5 seconds of 
pressing. The sealed porcine stomach was kept for 12 h at room tem- 
perature to monitor the DST-based sealing. We sprayed 0.01% (w/v) 
sodium azide solution (in PBS) onto the porcine stomach to avoid tis- 
sue degradation. 

To assess sealing of an anastomosis site in a small intestine, we 
dissected a porcine small intestine into two pieces. Anastomosis 
of the dissected small intestine was made by approximating each 
edge of the small intestine followed by wrapping of the DST (2.5 cm 
wide and 8 cm long) and 5 seconds of pressing around the approxi- 
mated edges. To check that the DST had produced fluid-tight sealing, 
we applied water to the anastomosed small intestine at 60 mm Hg 
pressure using a microdispenser. We sprayed 0.01% (w/v) sodium 
azide solution (in PBS) onto the porcine small intestine to avoid tissue 
degradation. 

To assess the adhesion of a drug-delivery device, we introduced a cut 
(4 cm in length) on an explanted porcine heart. The aorta was connected 
to tubing, and programmed pressurized air inputs were introduced 
into the heart using a microdispenser to mimic heart beats. To prepare 
the drug-delivery device, we added 0.5% (w/w) fluorescein sodium 
salt as a mock drug into a hydrogel patch (2.5 cm in width and 5 cm in 
length). The drug-loaded hydrogel patch was then stretched to fit the 
cut and adhered onto the beating porcine heart with the perforated 
DST. The adhered drug patch on the beating heart was kept for 12 h 
at room temperature with continuous beating to allow diffusion of 
the mock drug into the heart tissue. The diffusion of the mock drug 
was imaged using a fluorescence microscope (LV100ND, Nikon). To 
assess the adhesion of a strain sensor, we adhered the DST-strain-sensor 
hybrid onto the beating porcine heart after removing the backing. The 
adhered strain sensor on the beating heart was kept for 12 h at room 
temperature with continuous beating, and then connected with the 
digital multimeter to monitor the deformation of the beating heart. 
All devices were adhered onto the beating heart after washing out 
the surfaces with PBS, followed by 5 seconds of pressing. To prevent 
dehydration and degradation during experiments of longer than 1 h 
in ambient conditions, we covered the heart with a wet towel soaked 
with 0.01% (w/v) sodium azide solution (in PBS). 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 
All data are available in the main text or the Supplementary Information. 

Extended Data Fig. 1 | The overall process of DST application. The DST can be 

Extended Data Fig. 4 | Properties of the dry DST. a, The DST is initially prepared as a thin dry film that can conform to tissue surfaces. b, Nominal stress versus 

Extended Data Fig. 5 | Quantification of residual monomer in the DST by HPLC. a, Standard calibration curve of acrylic acid for HPLC. b, Results of HPLC 
characterization of the DST extraction solution. The DST has a very low concentration of residual acrylic acid monomers: 74 ng per 1 mg of the DST. 

Extended Data Fig. 7 | Effect of covalent crosslinks on the long-term stability 
of adhesion by the DST. a, Interfacial toughness and shear and tensile strength 
versus pressing time for wet porcine skins adhered by the DST without NHS 
ester. Note that these adhesion tests were performed immediately after the 
initial pressing. b, Interfacial toughness and shear and tensile strength 

Extended Data Fig. 8 | Adhesion performances of the DST and various 
existing tissue adhesives. Shown are typical values for interfacial toughness, 
shear and tensile strength, and application time required for adhesion 
formation, for the DST (adhered between hydrogel and porcine skin) and 
various existing tissue adhesives. The interfacial toughness, shear strength 
and tensile strength for all commercially available adhesives were measured 
according to the application manual provided for each product. The 
application time for commercially available adhesives was based on the 
application manuals provided. The data for ultraviolet-curable surgical glue, 
nanoparticle solution and tough hydrogel adhesive were obtained from the 
literature. Values represent the mean and the standard deviation (n = 3–5). 

Extended Data Fig. 10 | Sealing of ex vivo porcine trachea and small intestine by the DST. a, Sealing of an air-leaking, lacerated ex vivo porcine trachea by a 
hydrogel patch adhered by the DST. b, Anastomosis of a dissected ex vivo porcine small intestine by the DST. 

Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size Ex vivo experiments on porcine organs and tissues were conducted to evaluate adhesion performance of the DST. The appropriate sample size 
(n=3-5) was used for each test. 
In vivo experiments on rat were conducted to investigate in vivo biocompatibility and biodegradability based on histological assessment after 
implantation. The appropriate sample size (n=3) was used to evaluate biocompatibility and biodegradability of each sample. 


Data exclusions No data were excluded for ex vivo experiments. 
For in vivo experiments, the following rats were excluded: animals that did not survive the surgery, animals that showed infection or opened the sutured incision, 
and animals with defective samples. 


Replication Ex vivo studies for mechanical characterization of the DST were reliably reproduced. The average and standard deviation were reported for 
each test. 
In vivo studies for biocompatibility and biodegradability were reliably reproduced based on similar histological assessment for each case by 
the blinded pathologist. Adhesion performance was compromised when a sample was defective owing to an expired chemical. 


Randomization No formal randomization was used but surgeries were carried out on groups, which were alternated. Each group was completed over 2-3 
different surgery days. 


Blinding All histological assessments were conducted by a blinded pathologist who was not informed of the type or group of the samples. 

Laboratory animals For ex vivo studies, porcine organs and tissues were purchased from Sierra Medical Inc. (Whittier, CA). 
For in vivo studies, female Sprague Dawley rats, aged by weight (225–275 g), were purchased from Charles River Laboratories 
(Wilmington, MA). 


Wild animals This study does not involve wild animals. 
Field-collected samples This study does not involve field-collected samples. 
Ethics oversight Both ex vivo and in vivo animal procedures were reviewed and approved by the Massachusetts Institute of Technology 
Committee on Animal Care (CAC). 

Supramolecular soft crystals are periodic structures that are formed by the 
hierarchical assembly of complex constituents, and occur in a broad variety of 
‘soft-matter’ systems. Such soft crystals exhibit many of the basic features (such as 
three-dimensional lattices and space groups) and properties (such as band structure and 
wave propagation) of their ‘hard-matter’ atomic solid counterparts, owing to the 
generic symmetry-based principles that underlie both. ‘Mesoatomic’ building 
blocks of soft-matter crystals consist of groups of molecules, whose sub-unit-cell 
configurations couple strongly to supra-unit-scale symmetry. As yet, high-fidelity 
experimental techniques for characterizing the detailed local structure of soft matter 
and, in particular, for quantifying the effects of multiscale reconfigurability are quite 
limited. Here, by applying slice-and-view microscopy to reconstruct the 
micrometre-scale domain morphology of a solution-cast block copolymer double gyroid over 
large specimen volumes, we unambiguously characterize its supra-unit and sub-unit 
cell morphology. Our multiscale analysis reveals a qualitative and underappreciated 
distinction between this double-gyroid soft crystal and hard crystals in terms of their 
structural relaxations in response to forces: namely, a non-affine mode of 
sub-unit-cell symmetry breaking that is coherently maintained over large multicell dimensions. 
Subject to inevitable stresses during crystal growth, the relatively soft strut lengths 
and diameters of the double-gyroid network can easily accommodate deformation, 
while the angular geometry is stiff, maintaining local correlations even under strong 
symmetry-breaking distortions. These features contrast sharply with the rigid lengths 
and bendable angles of hard crystals. 


Three-dimensional (3D) tomographic imaging is the definitive experi- 
mental technique for determining the morphology of complex nano- 
structures. Tomography can be performed with a variety of microscopic 
methods, provided that there is a suitable match between the imaging 
resolution and the feature size. For block copolymer (BCP) structures in 
which periodicities are typically in the 10–100-nm regime, with domain 
features on the scale of 2–20 nm, electron microscopy techniques are 
generally required. To date, nearly all 3D tomograms of bulk-phase 
BCPs have been made using transmission electron microscopy (TEM) 
tomography. Although powerful, this technique is quite limited with 
regard to the range of sample thicknesses, and inevitably incurs both 
sample deformation from microtomy and information loss associated 
with the restriction of tilt angles. 

Here we use the considerable advantages afforded by a wholly distinct 
approach to the tomography of nanostructured 3D morphologies: 
slice-and-view scanning electron microscopy (SVSEM; also named 
focused ion beam scanning electron microscopy, or FIB-SEM). Crucially, 
by comparison with TEM tomography, SVSEM tomography can 
provide a much larger reconstruction in all three spatial dimensions, 
facilitating 3D analysis of volumes many orders of magnitude larger 
than those of the typical unit cell and allowing 3D fast Fourier transform 
(FFT) from selected volumes within the overall reconstruction. We 
study a polystyrene-polydimethylsiloxane (PS-PDMS) double-gyroid 
BCP. The double gyroid is composed of two independent, interpenetrating 
enantiomorphic tubular networks of one type of block (PDMS), 
separated by a slab-like domain (whose shape is loosely approximated 
by the G minimal surface) that is constituted by the second, 
majority block (PS). Although a double gyroid would nominally be 
classified as cubic (cDG; space group of Ia3̄d) in accordance with 
equilibrium theories of BCPs, a more critical analysis of the morphology 
reveals that the slow solution-cast material organizes into distinct 
triclinic variants of the cubic phase. 

A 3D rendering of a 2.70 × 2.70 × 0.64 μm³ volume containing about 
2,000 unit cells from within a large double-gyroid monodomain is 
shown in Fig. 1a. The two constituent tubular PDMS networks (shown 
in red and blue), which each enclose about 20% of the total volume, 
are defect-free, consistent with their disjointed segmentation. Prior 
standard small-angle X-ray scattering (SAXS) and TEM analysis 
indicated that this sample has a cDG structure; however, careful analysis 
of the morphology throughout the ‘ultra-large’ volumes accessible to 
3D SVSEM reconstruction reveals a decidedly non-cubic unit-cell 
symmetry. Figure 1b shows an experimental reconstructed 2 × 2 × 2 unit-cell 
volume, including triclinic unit-cell parameters. The symmetry of the 
particular cells in this grain deviates greatly from cubic, with the largest 
and smallest lattice parameters differing by 12% (for example, 130 nm 
versus 116 nm) and with pairs of translation vectors deviating by up to 14° 
from orthogonality. The synchronized slices normal to the [001] direc- 
tion from the experimental reconstruction and from the corresponding 
deformed self-consistent field (SCF) double-gyroid model are compared 
in Supplementary Video 1. As shown in Extended Data Fig. 1, unit-cell 
parameters exhibit only small deviations throughout a given 
many-cubic-micrometre-scale grain. Other grains exhibit distinct triclinic variants. 
These triclinic variants deviate from cubic symmetry by up to about 
20% in both length and angle. We denote this morphology as ‘variable 
triclinic double gyroid’ (vtDG) in order to indicate unit cells that are 
essentially coherent within grains, but vary substantially from grain to 
grain. As shown in Extended Data Fig. 2, directions and magnitudes of 
deviations from cubic symmetry in distinct regions of the sample are 
uncorrelated with slicing directions, ruling out the possibility that the 
measured anisotropy is an artefact of SVSEM imaging or reconstruction. 
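
The deviation from cubic symmetry quoted above can be expressed in terms of the six triclinic cell parameters. The sketch below converts three lattice translation vectors into (a, b, c, α, β, γ) and reports the relative spread of the lengths and the largest departure of an angle from 90°. It is an illustration only: the function names and the example vectors are hypothetical, and measuring the length spread relative to the smallest parameter is an assumption (it reproduces the quoted 12% for 130 nm versus 116 nm).

```python
import math


def cell_parameters(a_vec, b_vec, c_vec):
    """Return (a, b, c, alpha, beta, gamma); lengths as given, angles in degrees."""
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def angle_deg(u, v):
        cos_uv = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_uv))))

    # Crystallographic convention: alpha is between b and c, beta between
    # a and c, gamma between a and b.
    return (norm(a_vec), norm(b_vec), norm(c_vec),
            angle_deg(b_vec, c_vec), angle_deg(a_vec, c_vec),
            angle_deg(a_vec, b_vec))


def cubic_deviation(params):
    """Relative length spread and largest angular departure from 90 degrees."""
    a, b, c, alpha, beta, gamma = params
    length_spread = (max(a, b, c) - min(a, b, c)) / min(a, b, c)
    max_angle_dev = max(abs(x - 90.0) for x in (alpha, beta, gamma))
    return length_spread, max_angle_dev


# Hypothetical lattice vectors (nm): lengths 130, 116 and 120, with c tilted
# to 76 degrees from a.
tilt = math.radians(76.0)
params = cell_parameters((130.0, 0.0, 0.0),
                         (0.0, 116.0, 0.0),
                         (120.0 * math.cos(tilt), 0.0, 120.0 * math.sin(tilt)))
spread, angle_dev = cubic_deviation(params)
print(f"length spread: {100 * spread:.1f}%, angular deviation: {angle_dev:.1f} deg")
```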

Structural symmetries can also be assessed using 3D FFT of the SVSEM 
data. The detailed distribution of intensity from a particular (hkl) Bragg 
plane depends on the orientation and spacing distributions of the (hkl) 
planes within the volume of the sample transformed. We use selected 

Fig. 1 | Supra-unit cell structure of the PS-PDMS 
double-gyroid tubular network phase. a, Real-space 
SVSEM reconstruction of PDMS domains (rendered 
red and blue), corresponding to about 2,000 unit cells 
of the (non-cubic) double gyroid. The white box at the 
bottom right highlights the size of a 2 × 2 × 2 unit-cell 
subvolume. b, A magnified view showing a different 
triclinic 2 × 2 × 2 volume cropped from within the large 
volume shown in panel a. The binarized (PS versus 
PDMS) raw SVSEM voxels are further divided into red 
and blue PDMS voxel networks, with the PS domains 
rendered transparent. The triclinic unit cell 
parameters (a, b, c, α, β and γ) measured from real 
space are also shown. c, Rendering of the 3D FFT 
analysis of a region containing approximately 160 unit 
cells, with Bragg-like spots highlighted as 
quasi-spheroidal volumes, showing how the data intersect 
with a (100) plane. The central 000 peak is indicated as 
a red spot. See Supplementary Video 2 for an animated 
view of the 3D diffractogram. d, 2D logarithmic 
intensity plot of a (100) section from the 3D Fourier 
data. In addition to the allowed {220}_cDG, {400}_cDG, 
{420}_cDG and {440}_cDG reflections (indexed in white), 
there are {110}_cDG, {200}_cDG, {310}_cDG and {530}_cDG 
reflections that would be forbidden in the cubic 
double gyroid (indexed in red). 

Fig. 2 | Sub-unit cell IMDS curvature and distance metrics. a, Bragg-filtered 
reconstruction of the IMDS for one unit cell (for a description of the IMDS 
isosurface, see Methods). b, Graph plotting the normalized Gaussian (K) and 
mean (H) curvatures of the IMDS and their respective probability functions (P), 
based on a region of 35 unit cells. The curvatures are normalized by ⟨D⟩ = 130 nm 
(the average lattice parameter cell dimension measured by SAXS). The diagonal 
and vertical dashed blue lines indicate the curvature distributions for a constant 
minimal G surface thickness (CMT) IMDS and a constant mean curvature (CMC) 
IMDS. c, The same unit cell as in panel a but with the IMDS made 
semi-transparent, revealing the two skeletal graphs that are found by a thinning 
algorithm (see Methods). d, Distance distributions for the minority and majority 
domains. The minority-block thickness is measured from the IMDS to the closest 
distance to the skeletal graph. The majority-block thickness is measured as half 
the distance from a point on the red IMDS to the closest point on the blue IMDS. 


Fig. 3 | Sub-unit cell length and angular metrics. a, Region of roughly 100 unit 
cells, where a topological thinning of the segmented SVSEM tomogram has been 
used to create the two skeletal graphs of the double-gyroid structure. The 
viewing direction is approximately [100]. b, A small region from the theoretical 
cubic unit cell, depicting the skeletal graph and the surrounding IMDS, 
highlighting an internode strut with one cubic unit cell (top) and its dihedral 
rotation when viewed along the strut (bottom). The solid dark red central IMDS 
piece contains two nodes of the graph, and by viewing along the strut 
connecting the nodes, one can measure the dihedral angle, θ. c, Polar plot of 
the dihedral angle for the red (⟨θ⟩ = +70.9°) and blue (⟨θ⟩ = −70.8°) 
experimental networks, showing the narrow distribution of angles in each 


volume diffraction (SVD)—the SEM analogue of the selected area dif- 
fraction (SAD) used in TEM analysis—in order to choose the location, 
shape and size of the volume to be transformed from within the larger 
reconstructed sample volume (see Methods). The most intense allowed 
cubic Ia3̄d reflections (the {211} and {220} families) are used to define 
a transformation matrix that fits a triclinic lattice in order to maximize 
the overall intensity values at the deformed reciprocal lattice points 
(Fig. 1c). Measurements of the distorted reciprocal cell parameters from 
SVD are in perfect correspondence with the real-space measurements. 
Notably, inspection of the 3D SVD pattern shows intensity spots located 
at ‘symmetry-forbidden’ Bragg positions if indexed with the cubic Ia3̄d 
space group. Forbidden reflections (see, for example, Fig. 1d) become 
allowed when distortions break the centring translation, screw and glide 
symmetries of the cubic structure. Although the experimental intensi- 
ties of the most prominent of these ‘forbidden’ reflections are two to 
three orders of magnitude below those of the {211} reflections, they are 
10-10° greater in magnitude than the ‘symmetry-allowed’ {321} and 
{400} reflections. The occurrence of these relatively strong ‘forbidden’ 
reflections indicates that the vtDG morphology does not simply cor- 
respond to an affine deformation of the cDG morphology, but rather 
to the non-affine rearrangement of the morphology at the sub-unit-cell 
scale. We note that distortions (attributed to solvent shrinkage forces) 
have resulted in the appearance of forbidden reflections in prior SAXS 
studies of double-gyroid structures in both bulk and thin-film BCPs. 

We also analyse the sub-unit-cell morphology of the PS-PDMS double 
gyroid, first focusing on the shape of the intermaterial dividing surface 
(IMDS) and on domain thicknesses. Although the resolution of the raw 
SVSEM tomogram is limited by the roughly 3-nm width of image voxels 
(as seen in Fig. 1b), the intragrain coherence of 3D morphology over 
large multicell volumes enables quantitative analysis of the ‘average’ 
unit cell at higher resolution. This is accomplished by Fourier averaging 
of the raw greyscale SVSEM data through application of a 3D Bragg 

network. d, A portion of the skeletal graph from panel a, with the struts coloured 
according to their length, showing a factor of roughly three in variability. 
e, Spherical plot of the strut lengths versus their orientation in the laboratory 
frame. f, The same data projected onto a Mercator plot, where Θ and Φ are the 
polar and azimuthal angles, respectively, of the strut orientation as shown in e. 
The <110> strut directions of a cDG lattice are shown as red circles. g, Inverse 
correlation between strut length and mean ‘tube’ radius of the IMDS measured 
at the midpoint along the strut (with red and blue colours indicating two distinct 
networks), showing the transverse contraction (dilation) upon length stretching 
(compression) of tubular struts. 


filter (see Methods), an approach that is well established in 2D 
high-resolution TEM. An isosurface constructed from the Bragg-filtered 
SVSEM data shows the two IMDS regions, each containing PDMS and 
enclosing roughly 20% of the unit volume (Fig. 2a), allowing measure- 
ment of the mean (H) and Gaussian (K) curvature distributions (Fig. 2b). 

Heuristically, we can compare this experimental H versus K distri- 
bution with two limiting theoretical geometries: a constant matrix 
thickness (CMT) surface, which is surface displaced (normally) by a 
constant from the G minimal surface’; or a constant mean curvature 
(CMC) surface”. Although a CMC shape has been suggested” on the 
grounds that it minimizes the IMDS area for a fixed volume fraction”, 
the CMT surface minimizes the entropic penalty of variable stretch- 
ing of the majority component at the expense of a slight increase in 
the interfacial area. The curvature distributions for mathematical 
cDG surfaces of both types are shown in Extended Data Fig. 3a, b. The 
curvature distribution of aCMC surface is localized to a vertical band 
HD =2.23, while that ofa CMT surface follows Steiner’s linear relation- 
ship, HD=-(t/2D)KD? or HD=-0.103KD? (where Dis the cell repeat length 
and tis the constant thickness of the slab-like matrix domain). Relative 
to these reference surfaces, the experimental curvature distribution is 
closer to the CMC distribution. Also shown in Extended Data Fig. 3c-f 
arethe Hand K distributions for SCF theoretical calculations of cDG as 
a function of segregation strength. Although the IMDS shape of these 
model equilibrium states is always intermediate to the CMC and CMT 
shapes in terms of the curvature distribution, we note that the shapes 
are relatively CMC-like in weakly segregated gyroids and trend towards 
CMT-like when interblock repulsions are increased. This observation, in 
combination with the CMC-like distribution measured in Fig. 2b, might 
suggest that the experimental IMDS shape is inherited froma state in 
which the PS domain vitrifies and fixes the shape of the ordering struc- 
ture during solvent evaporation. It remains far from clear, however, if 
and how closely the shape of the partially solvated and non-equilibrium 
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Fig. 4| Models of sub-unit cell morphology. a—c, Orthographic views of 
portions of skeletal graphs for cubic (a), affine triclinic (b) and non-affine 
triclinic (c) models derived from equilibrium SCF calculations with cubic or 
triclinic symmetry. Here the colour indicates the strut length between nodes, 
normalized by the mean cell repeat length D. d, Probability distributions for the 
strut length/mean cell repeat length computed for affine and non-affine triclinic 


double-gyroid morphology can be modelled by an equilibrium theory
for neat diblocks. We note that the IMDS shape of a BCP double gyroid
has been analysed previously, yet the limited resolution of the IMDS
curvature in the TEM tomography measurements made the surface shape
signature inaccessible.
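
As an illustration of the Steiner relationship quoted above for a CMT surface, the following minimal Python sketch offsets a point of a minimal surface along its normal and checks that the resulting mean and Gaussian curvatures satisfy H = -(t/2)K. The curvature value, the sign convention for the offset and all function names are assumptions made for illustration; only the slope 0.103 is taken from the text.

```python
def offset_curvatures(kappa1, kappa2, d):
    """Principal curvatures of the parallel surface displaced by d along the normal.

    Assumed sign convention: kappa -> kappa / (1 - d * kappa).
    """
    return kappa1 / (1.0 - d * kappa1), kappa2 / (1.0 - d * kappa2)


def mean_and_gaussian(kappa1, kappa2):
    """Mean curvature H and Gaussian curvature K from the principal curvatures."""
    return 0.5 * (kappa1 + kappa2), kappa1 * kappa2


# On a minimal surface the principal curvatures are equal and opposite (H = 0).
kappa = 0.7        # illustrative curvature in units of 1/D (not a measured value)
half_t = 0.103     # t/(2D), read off the quoted relation HD = -0.103 K D^2

k1, k2 = offset_curvatures(kappa, -kappa, half_t)
H, K = mean_and_gaussian(k1, k2)

steiner_residual = H + half_t * K   # vanishes if H = -(t/2) K holds
thickness_over_D = 2.0 * half_t     # slab thickness implied by the quoted slope

print(f"H = {H:.4f}, K = {K:.4f}, residual = {steiner_residual:.2e}")
print(f"implied matrix thickness t = {thickness_over_D:.3f} D")
```

Under these assumptions the quoted slope corresponds to a matrix slab thickness of roughly 0.2 D.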

The inhomogeneous geometry of the double gyroid implies a heterogeneous
distribution of domain thicknesses, corresponding to
variable degrees of chain extension throughout the structure. To characterize
the variable thickness of the minority domains, we derive a skeletal
graph from the SVSEM reconstruction; this graph consists of 1D struts
threading through geometric ‘centres’ of tubular domains and meeting
at threefold junctions (Fig. 2c). We define a ‘minor block thickness’ as
the shortest distance from a point on the IMDS to the interior skeleton,
while a ‘major block thickness’ is half of the shortest distance from a
point on one IMDS to the IMDS of the opposing network (Fig. 2d, inset).
Distributions of the minor (PDMS) and major (PS) block thicknesses are
shown in Fig. 2d. Previous explanations for double-gyroid formation in
BCP have emphasized that the constraints involved in packing polymer
blocks at constant density require variation in the stretch length, most
models. e, Probability distributions for internode angles for the same computed
networks, with colours as in panel d. f, Intensity of reflections (circles) made by
3D FFT of: the SVSEM reconstruction (left; roughly 160 unit cells); the affine 
triclinic SCF model (centre; 64 unit cells); and the non-affine triclinic SCF model 
(right; 64 unit cells). For reference, the allowed peaks for cDG are plotted with 
open squares. 


prominently derived from the greater distance from the IMDS to the
threefold junction node than from the corresponding IMDS-to-skeleton
distance at the mid-portion of an internode strut. This notion of
‘packing frustration’ is consistent with the broad spread of minor-block
length. However, the measured distribution of thicknesses for the major
PS block also exhibits substantial spread, indicating that, contrary to
the present heuristic picture, packing of the majority block is also
frustrated. Non-uniform matrix thickness thus arises as a consequence
of the ‘tug of war’ between majority-block stretching and the countervailing
forces that favour more uniform minor-block lengths, as well as
from the drive towards area-minimizing IMDS shapes, consistent
with the more CMC-like distribution observed in Fig. 2b.

Although the IMDS shape and domain thickness are necessarily 
inhomogeneous even in the ideal cDG, the internode struts in the ideal 
cDG are uniform in length and all orient along <110> directions. The 
experimental network morphology can be further analysed using the 
skeletal graphs (Fig. 3a). We first consider the dihedral angle (Fig. 3b). 
In an ideal cDG network this angle is ±70.5° (modulo 180°), where the
sign characterizes the chirality of the two enantiomorphic single-gyroid 


networks. Remarkably, the experimental dihedrals for the positive and
negative networks deviate little (with a root-mean-squared variance of
less than 11°) from the ideal cubic geometry values (Fig. 3c). We also show
(Extended Data Fig. 4) for the interstrut angles in the experimental vtDG
a deviation of only about 20% from the perfect threefold coordination
(120°) of cDG. This degree of local angular order in the PDMS networks
contrasts starkly with the pronounced variability in the length of tubular
struts as measured by skeletal edges. Strut length can vary by up to 300%
(Fig. 3d). Figure 3e, f analyses the lengths of PDMS struts according to
their orientation, and indicates a strong correlation between the orientation
and the 12 <110> directions of a cDG graph. Struts in a given orientation
are relatively homogeneous in length, but show prominent length
variations between distinct orientations, which are exceptionally large given the
more modest (roughly 20%) discrepancy between triclinic and cubic
cell geometry. Notably, in Fig. 3g we show concomitant contraction/dilation
of the transverse tubular radius with stretching/compression
of the strut length.

To understand the origin of the anomalously large variability among
PDMS strut lengths versus the relative constancy of strut angles, we consider
the microdomain structures and their associated skeletal graphs
derived from SCF models of the tDG. The first model, affine triclinic, is
generated by affinely deforming the SCF cDG structure (Fig. 4a) into a
particular tDG shape equal to that of the experimentally determined unit
cell (Fig. 4b). A second model, non-affine triclinic, instead uses the same
experimental triclinic cell boundary conditions to compute an equilibrium
SCF double-gyroid morphology (Fig. 4c). As shown in Fig. 4d, e, the
affine triclinic deformation of the double gyroid leads to spread of the
strut lengths and angles by roughly 10-20%, comparable in scale to the
imposed strains deforming the cubic cell to triclinic symmetry. Remarkably,
if instead we consider the predicted double-gyroid morphology
that equilibrates within the same triclinic cell, the network structure
adopts an increased degree of length dispersity (Fig. 4d), well beyond
the nominal lengths derived from cubic to triclinic distortion, and yet
a reduced degree of angle dispersity (Fig. 4e). Moreover, comparison
of the spectrum of Fourier peaks from the experimental versus the SCF
affine triclinic and non-affine triclinic models reveals extraordinary correspondence
between the experimental structure and the non-affine
triclinic model (Fig. 4f).

Taken together, these observations suggest a strong thermodynamic 
coupling of sub-unit-cell morphology to symmetry breaking at the 
supra-unit-cell scale of the double-gyroid phase. Strut lengths (and 
diameters) are relatively soft and thus easily accommodate deformation, 
presumably through relatively rapid intradomain transport of polymer 
chains. By contrast, the angular geometry of the gyroid network is stiff, 
favouring local correlations that are maintained even under strong 
symmetry-breaking deformations. This suggests a heuristic model for 
the non-affine structure of symmetry-broken soft-matter networks, 
consisting of periodic networks of tensed struts (so-called Steiner networks,
which are 1D analogues of Plateau borders) that adjust lengths
locally yet maintain force-balancing angular coordination at the nodes, 
in order to minimize the stretching that occurs in response to imposed 
changes in unit-cell symmetry. 

Our observations have been made possible by the accessibility of 
ultralarge volumes to SVSEM tomography, in combination with the 
Bragg averaging of selected volumes, in order to achieve enhanced 
resolution of sub-unit-cell features. New distance and angle metrics 
applied to the complex double-gyroid phase allow deeper insight into 
the complex energetic competition, as reflected in the distinctive 
structural distortions in actual samples that invariably result from solvent
evaporation and grain boundary incompatibility. The symmetry-breaking
distortions are predicted to have impacts on, for example, the
photonic/phononic band properties of double-gyroid assemblies.
Our research opens up a new way of unambiguously characterizing a 
variety of soft-matter systems that assemble under different processing 


conditions into a variety of soft crystals (beyond networks), illuminating
their formation mechanisms, supra-cell and sub-cell structures and
structure-property relationships.
Methods


Material and sample preparation 

We synthesized the polystyrene-poly(dimethylsiloxane) (PS-PDMS)
diblock copolymer by sequential anionic polymerization of styrene
and hexamethylcyclotrisiloxane. The polymer has number-average
molecular weights of 43.5 kg mol⁻¹ (for PS) and 29.0 kg mol⁻¹ (for PDMS),
with an overall composition of 40% PDMS (by volume) and a polydispersity
index of 1.04. The sample studied was cast slowly (over the course
of one week) from a 10 wt% solution (2 ml) in toluene. After drying, the
sample was heated to 60 °C for 3 days in a vacuum in order to remove
any residual solvent.

On the basis of characteristic 2D TEM and radially averaged SAXS, the
PS-PDMS diblock copolymer has been reported to have a double-gyroid
morphology. We further characterized the small piece of sample that
we used for SVSEM with synchrotron X-rays at Sector 12-ID-B of the
Advanced Photon Source in the Argonne National Laboratory. Given the
SAXS pattern (Extended Data Fig. 5), the structure can indeed be nominally
associated with a double-gyroid morphology, with an average
cubic repeat of D = 130 nm. However, we observe a prominent low-q peak
associated with the {110} planes that is forbidden for the cubic Ia-3d
space group. Given the size of the incident X-ray beam and the sample
thickness, this SAXS pattern must come from about 10’ unit cells. Before
SVSEM imaging, the sample was attached to a 45° SEM stub with double-sided
conductive carbon tape, and the outer surface was then coated
with a 50-nm layer of platinum.


Slice-and-view SEM data acquisition 

Extended Data Fig. 6 shows the workflow involved in SVSEM. This is a
tomographic method enabled by advances in ion milling, monochromated
field emission electron beams and electron imaging detectors,
as well as by precise stage motion and sophisticated software routines
for correlation and registration of the image stack. During data collection
(Extended Data Fig. 6a), ion milling is combined with electron
imaging (Extended Data Fig. 7a): an incident high-energy ion beam is
used to make an impact on and mill away a thin slice of the near-surface
region of the sample (Extended Data Fig. 7b, i); then an electron beam
is directed at the surface and a secondary-electron image is recorded
(Extended Data Fig. 7b, ii). This ion-slice, electron-image sequence is
repeated until a sufficient thickness of the sample has been serially
imaged. The 3D tomogram is then reconstructed by vertical stacking
of the aligned 2D SEM images. Three large-volume tomographic data
sets from distinct double-gyroid grains are available at an online data
repository (https://doi.org/10.7275/wv24-3j62).

We used a Thermo Fisher Helios NanoLab 660 SEM/FIB DualBeam
system for data acquisition. A focused gallium-ion (Ga⁺) beam with an
energy of 30 keV and a relatively low beam current of 80 pA was used to
mill the sample surface in order to minimize damage from the FIB beam.
A 1-keV electron beam with a beam current of 50 pA was used to image the
sample surface with a through-lens (TLD) secondary-electron detector
(secondary-electron images taken with different incident energies are
shown in Extended Data Fig. 8). Notably, the stronger scattering from
the higher atomic number of silicon atoms in the PDMS and the resulting
additional secondary-electron emission are sufficient to provide excellent
intrinsic contrast between the PS and PDMS domains without staining.
We used X-shaped fiducials to register the FIB and secondary-electron
images during the automatic slice-and-view process, and drilled a deep
hole (or holes) into the sample along the direction normal to the observing
surface with FIB (at an acceleration voltage of 30 kV and beam current
of 0.23 nA) for fine registration of secondary-electron images (for details,
see Extended Data Fig. 9a-c). For FIB slicing, we set the slice thickness at
3 nm. Further monitoring during FIB image acquisition found the actual
slice thicknesses to be 2.96 ± 0.01 nm per slice (for details, see Extended
Data Fig. 9d). A potential relative rotation of SEM images during SVSEM
acquisition was also monitored and excluded (Extended Data Fig. 10
and Supplementary Video 3).


SCFT and IMDS models 

We performed self-consistent (mean) field theory calculations of 
diblock copolymer melts using a polymer self-consistent field (PSCF) 
code (http://pscf.cems.umn.edu/) as in ref. *° for cubic double gyroid 
(cDG) and non-affine triclinic double gyroid (natDG) morphologies, 
with a single family of chemical parameters: that is, a volume fraction
f and the product χN of the interblock repulsion χ and the degree of polymerization N.
The initial density-field profiles for a double gyroid were based on the
example files distributed with the software, and later targeted for the 
40% minority (PDMS) and 60% majority (PS) volume fractions. 

For the natDG we generated an initial density field whose lattice dimensions
are identical to those measured experimentally. We performed SCFT
calculations by iteratively changing the density fields while calculating
the free energy at each step and progressing towards a free-energy
minimum. Then, for cDG and natDG, we further adjusted the lattice
dimensions while maintaining the unit-cell length ratios in order to find
a metastable density field subject to imposed symmetry constraints. For
natDG, we fixed a/b = 0.97/1, a/c = 1.10/1 and angles α, β, γ as 75.9°, 84.7°,
89.1°, which derive from the unit-cell parameters computed from the
optimal (inverse) reciprocal lattice dimension from experimental SVSEM
measurements, and are therefore close to those of the local volume
region shown in Fig. 1b. Here we did not attempt to achieve an equivalent
segregation strength (χN) to match PS-PDMS systems, because it
is unclear which equilibrium values correspond best to the conditions
in which the sample (as it is undergoing solvent evaporation) becomes
ordered but remains fluid in both domains. Instead, we aimed to capture
the basic mechanism of the sub-unit cell non-affine structure without
imposing a close match in interblock repulsion. Hence, for fixed symmetries
(cDG or natDG) we consider the cell dimension that minimizes
the free energy for fixed ratios of unit-cell dimensions and angles. The
affinely sheared triclinic DG (atDG) was generated from cDG, by applying
a transformation matrix that consisted of lattice vectors matching
the natDG at the same composition and interaction parameters. Finally,
for each of the cDG, atDG and natDG, we generated image stacks of the
density fields for further geometric analysis.
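
To make the triclinic cell used for the natDG and atDG models concrete, the following minimal Python sketch builds lattice vectors from the quoted ratios (a/b = 0.97, a/c = 1.10) and angles (75.9°, 84.7°, 89.1°). It assumes the usual crystallographic convention (α between b and c, β between a and c, γ between a and b) and an arbitrary length scale a = 1; the function names are hypothetical and this is not the PSCF input format.

```python
import math


def triclinic_lattice_vectors(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Cartesian lattice vectors with a along x and b in the xy plane."""
    alpha, beta, gamma = (math.radians(x) for x in (alpha_deg, beta_deg, gamma_deg))
    va = (a, 0.0, 0.0)
    vb = (b * math.cos(gamma), b * math.sin(gamma), 0.0)
    cx = c * math.cos(beta)
    cy = c * (math.cos(alpha) - math.cos(beta) * math.cos(gamma)) / math.sin(gamma)
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)


def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u) * sum(q * q for q in v))
    return math.degrees(math.acos(dot / norm))


def affine_map(fractional, vectors):
    """Map fractional cell coordinates to Cartesian positions in the sheared cell."""
    return tuple(sum(f * vec[i] for f, vec in zip(fractional, vectors)) for i in range(3))


a = 1.0            # arbitrary length scale (assumption)
b = a / 0.97       # from a/b = 0.97/1
c = a / 1.10       # from a/c = 1.10/1
va, vb, vc = triclinic_lattice_vectors(a, b, c, 75.9, 84.7, 89.1)

recovered = (angle_deg(vb, vc), angle_deg(va, vc), angle_deg(va, vb))
corner = affine_map((1.0, 1.0, 1.0), (va, vb, vc))
print("recovered angles:", [round(x, 3) for x in recovered])
```

Applying `affine_map` to every point of a cubic-cell density field is one simple way to picture the affine shear that defines the atDG model.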

For the IMDS models in Extended Data Fig. 3a, b, we generated the
CMC surface of a tubular double-gyroid network by using Surface
Evolver and following the procedure in ref. 7. Thus we generated discrete
CMC surfaces corresponding to the IMDS by minimizing the area
and energy associated with a target mean curvature ($H_0$) for the IMDS
(that is, $\int (H(x) - H_0)^2\,dA$, where $H(x)$ is the mean curvature of the surface
at point x, and dA is the infinitesimal area at point x) by imposing volume
constraints such that the minority component volume enclosed by the
tubular networks is 40% of the cubic cell. Similarly, for the CMT IMDS,
we took a discretized surface of the gyroid minimal surface and then
pushed off all points by the same distance along the normal (±t), such
that the resulting volume enclosed by surfaces on either side of the
minimal surface matched the 60% volume fraction of PS in the unit cell.
See the supporting software at the online data repository (https://doi. 
org/10.7275/wv24-3j62) for details on generating image stacks from 
density fields. 


Morphological analysis 

The following subsections describe the steps (Extended Data Fig. 6b- 
g) that ultimately led to a detailed geometric analysis (Extended Data 
Fig. 6h). The custom computer codes used for these tasks are noted in 
each subsection, and the file ‘README.txt’ distributed with the sup- 
porting software outlines further instructions regarding the use of 
these codes with source data. All of the supporting software codes and 
‘README.txt’ are at the online data repository (https://doi.org/10.7275/ 
wv24-3j62). 


Visualization of 3D volumes using raw secondary-electron images. 
See Extended Data Fig. 6b, c. Visualization was carried out with Avizo 
software from Thermo Fisher. The raw secondary-electron images of the 
PS-PDMS BCP have excellent intrinsic contrast. We further segmented
the 3D tomogram by setting the initial voxel intensity threshold in order
to manually define a portion of the bright, higher-intensity region
(PDMS) as well as a portion of a darker, lower-intensity region (PS). We
then calculated the position of the boundary between the brighter and
darker regions using a gradient algorithm. The watershed operation
was then applied to fill in the two types of region. This procedure avoided
the creation of any internal islands within each type of domain. After
using a given threshold, we checked the volume fraction of each block
against the known value from nuclear magnetic resonance (NMR: 40/60
PDMS/PS). Using these segmented images, we can reconstruct a 3D
volume as coloured PDMS networks with a transparent PS matrix.


Selected-volume diffraction of SVSEM reconstruction. See Extended 
Data Fig. 6d-f. Because SVSEM captures the structure over large dimensions
in every direction of the sample, it enables high-resolution analysis
in reciprocal space, affording local observation of the orientation and
magnitude of distinct Bragg-like intensity regions without loss of phase
information. Before transforming the real-space volume data into Fourier
space, we applied a Hanning window in order to reduce artefacts
in the FFT associated with discontinuities at the sample boundary. The
filtered intensity of each real-space image voxel is given by:


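\[
I(u, v, w) = \frac{1}{8}\left[1 - \cos\left(\frac{2\pi u}{N_u}\right)\right]\left[1 - \cos\left(\frac{2\pi v}{N_v}\right)\right]\left[1 - \cos\left(\frac{2\pi w}{N_w}\right)\right] I_0(u, v, w)
\]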


where $I_0(u, v, w)$ is the original unfiltered voxel intensity; $N_u$, $N_v$ and $N_w$
are the number of voxels in each dimension; and (u, v, w) denote the
integer voxel positions in the real-space data. After applying the Hanning
window, the voxel intensity data were Fourier transformed in order to
obtain the reciprocal space representation, F(i, j, k).
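
The windowing step can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard Hann form, 0.5[1 - cos(2πn/N)] along each axis, applied to a nested-list volume; the function names are hypothetical, and a practical implementation on large tomograms would use an array library before the FFT.

```python
import math


def hann(n, size):
    """Standard Hann weight for integer position n along an axis of `size` voxels."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * n / size))


def apply_hann_window(volume):
    """Multiply every voxel of a nested-list volume[u][v][w] by the separable 3D window."""
    nu, nv, nw = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [
            [hann(u, nu) * hann(v, nv) * hann(w, nw) * volume[u][v][w] for w in range(nw)]
            for v in range(nv)
        ]
        for u in range(nu)
    ]


# Uniform test volume: the window suppresses the boundary and leaves the centre intact.
ones = [[[1.0 for _ in range(4)] for _ in range(4)] for _ in range(4)]
windowed = apply_hann_window(ones)
print(windowed[0][2][2], windowed[2][2][2])
```

Because the window falls smoothly to zero at the faces of the volume, the periodic extension implied by the FFT no longer contains a sharp discontinuity at the sample boundary.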

To perform further selected-volume-diffraction (SVD) analysis, we
needed an indexed reciprocal space lattice that fits with the FFT data.
To achieve this, we started with a small portion of a cDG reciprocal
space lattice, $G_{hkl}$. We included the two families of non-forbidden reciprocal
lattice vectors with the smallest magnitude of $G_{hkl}$, {211} and {220}.
These cubic reciprocal-space vectors have a magnitude of
$|b_i| = 2\pi/130$ nm⁻¹, where 130 nm is the average unit-cell length estimated
from SAXS. In order to fit the generated lattice to the FFT data, we constructed
a smooth interpolation of the FFT data, making it into a cubic
spline interpolation, $F_{\mathrm{smooth}}(k_x, k_y, k_z)$. The coordinates of each FFT pixel
are given by $(k_x, k_y, k_z) = (i\,\delta k_x, j\,\delta k_y, k\,\delta k_z)$, where $\delta k_x = 2\pi/\delta x$, $\delta k_y = 2\pi/\delta y$
and $\delta k_z = 2\pi/\delta z$, and $\delta x$, $\delta y$ and $\delta z$ are the real-space voxel dimensions.
We then fit the reciprocal lattice to the FFT interpolation. In doing so,
we assumed that the structure was periodic and that the real-space
lattice was affinely transformed from a cubic lattice. Applying a linear
transformation $x_i \to x'_i = A_{ij} x_j$, where the nine matrix elements $A_{ij}$ are
independent, we could transform from a cubic to a triclinic unit cell.
For the transformation to be linear, we required that $k'_i x'_i = k_i x_i$, so
$k'_i = k_j A^{-1}_{ji}$. The goal of our fitting procedure was to find the elements of
$A^{-1}$ that transform the cubic reciprocal lattice vectors to lie on the peaks
of the FFT. Specifically, we optimized the matrix elements $A_{ij}$ in order
to maximize the summed value of interpolated intensity at the deformed
reciprocal lattice vectors, $\sum_{k \in G_{hkl}} F_{\mathrm{smooth}}(k_j A^{-1}_{ji})$, where $G_{hkl}$ are the reciprocal
lattice vectors of the cubic double gyroid lattice. Having found
the indexed (deformed) reciprocal lattice that fits the FFT data, we
applied a targeted Bragg filter to the volume data. The Bragg filter was
applied in Fourier space with a mask of Gaussian windows, while each
window was centred on the selected reciprocal-space lattice point. The
Bragg mask is given by:


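\[
B(i, j, k) = \sum_{\mathbf{k} \in G'_{hkl}} \exp\left[-\frac{(i\,\delta k_x - k_x)^2}{2\sigma_x^2} - \frac{(j\,\delta k_y - k_y)^2}{2\sigma_y^2} - \frac{(k\,\delta k_z - k_z)^2}{2\sigma_z^2}\right]
\]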


where $\sigma_x$, $\sigma_y$ and $\sigma_z$ are the widths of the Gaussian windows in the mask and $G'_{hkl}$
is the set of vectors in the (non-cubic) reciprocal lattice that locate
each of the intensity peaks in the 3D FFT. We applied the filter pointwise
to the FFT data and then applied an inverse FFT in order to obtain the
filtered real-space volume data for further analysis. To carry out SVD
analysis, we selected reciprocal-space lattice points on the basis of the
corresponding overall intensity of each diffraction family of the FFT
data and their associated cubic q value (Extended Data Fig. 11). Here we
chose reciprocal-space lattice points such that their corresponding
diffraction families have an overall intensity greater than 10~ (normalized
by the strongest {211} family) and their associated cubic q values
are smaller than 0.2 nm⁻¹ to make the Bragg filtering mask (that is, the
{110}, {200}, {210}, {211}, {220}, {310}, B2B and {400}
families), while the standard deviation of the Gaussian window is two
pixels (that is, $\sigma_x = 2\delta k_x$, $\sigma_y = 2\delta k_y$, $\sigma_z = 2\delta k_z$). A comparison of the 3D FFT
pattern of the raw volume data and the 3D FFT pattern of the volume
data after SVD treatment is shown in Extended Data Fig. 6d, e. See the
supporting software 2.
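
As a rough illustration of the selection criterion and of the Gaussian mask, the Python sketch below computes the cubic q value for the families that are legible in the list above, using D = 130 nm, and evaluates a sum-of-Gaussians mask at a single reciprocal-space position. The two lattice points and the window width in the example are invented for illustration, and the function names are hypothetical; this is not the supporting software.

```python
import math

D_NM = 130.0  # average cubic repeat from SAXS, as quoted in the text


def q_magnitude(hkl, d_nm=D_NM):
    """Cubic |q| in nm^-1 for Miller indices (h, k, l)."""
    h, k, l = hkl
    return 2.0 * math.pi / d_nm * math.sqrt(h * h + k * k + l * l)


def bragg_mask(position, lattice_points, sigmas):
    """Sum of Gaussian windows centred on each lattice point, evaluated at `position`."""
    total = 0.0
    for point in lattice_points:
        exponent = sum(
            (p - g) ** 2 / (2.0 * s * s) for p, g, s in zip(position, point, sigmas)
        )
        total += math.exp(-exponent)
    return total


families = [(1, 1, 0), (2, 0, 0), (2, 1, 0), (2, 1, 1), (2, 2, 0), (3, 1, 0), (4, 0, 0)]
q_values = {hkl: q_magnitude(hkl) for hkl in families}
for hkl, q in q_values.items():
    print(hkl, f"{q:.4f} nm^-1", "kept" if q < 0.2 else "dropped")

# Toy mask: two illustrative lattice points and an illustrative window width.
points = [(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0)]
sigmas = (0.01, 0.01, 0.01)
on_peak = bragg_mask((0.1, 0.0, 0.0), points, sigmas)
between_peaks = bragg_mask((0.0, 0.0, 0.0), points, sigmas)
```

The mask is close to one at a selected lattice point and essentially zero between points, so multiplying the FFT by it retains only the intensity near the indexed reflections.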


Network skeletal graphs and analysis. See Extended Data Fig. 6g, h
(with regard to dihedral angles, strut lengths and strut orientations).
Using ImageJ (https://imagej.nih.gov/ij/), we binarized the greyscale
image-stack data in order to identify the tubular networks formed by the
minority domains, and separated them from the majority-block-filled
matrix by using a threshold such that the volume fractions of the two
binary components matched with the experimentally reported volume
fractions. Note that although the same analysis could be applied to
post-Bragg filtered data, the data in Fig. 3 consider a larger volume than
is accessible given the memory limitations of FFT filtering in Mathematica (see
https://doi.org/10.7275/wv24-3j62). Hence, in order to analyse large-volume
networks, we extracted the skeleton directly from the raw SVSEM
data (comparative analyses of skeletons from pre- and post-filtered data
in a smaller volume confirm that Bragg filtering has a negligible impact
on network statistics).
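
The volume-matched binarization described above can be pictured with the following minimal Python sketch, which picks the intensity threshold so that a target fraction of voxels (here 40%, the PDMS volume fraction) is classified as bright. It is a simple quantile-based stand-in written for illustration, not the ImageJ routine, and the function names and synthetic intensities are assumptions.

```python
def threshold_for_fraction(intensities, target_fraction):
    """Return a threshold t so that about `target_fraction` of voxels have value >= t."""
    ordered = sorted(intensities)
    n_bright = round(target_fraction * len(ordered))
    n_bright = min(max(n_bright, 1), len(ordered))
    return ordered[len(ordered) - n_bright]


def measured_fraction(intensities, threshold):
    """Fraction of voxels classified as bright (minority domain) at this threshold."""
    return sum(value >= threshold for value in intensities) / len(intensities)


# Synthetic greyscale values standing in for a flattened image stack.
voxels = list(range(100))
threshold = threshold_for_fraction(voxels, 0.40)
fraction = measured_fraction(voxels, threshold)
print(threshold, fraction)
```

With real data, ties in voxel intensity mean that the achieved fraction only approximates the target, which is why the result should be checked against the known composition.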

We then reduced these networks into 1D skeletal graphs, that is,
straight-line bonds that connect nodes which are threefold coordinated
or higher; no fourfold or higher-fold nodes were identified in
this way (indicating the absence of topological defects). The initial task
of reducing filtered 3D volume data into 1D lines was done using the
inbuilt skeletonization feature in ImageJ. This procedure followed ref. *°,
and it reduces binarized volume data into a 1D curve (also referred to as a
medial axis) that is a collection of voxels. To identify the skeletal graphs,
we subjected the 1D curve to further refinements. We did this using a
custom Mathematica code, whereby we first converted the voxel collections
into a graph by taking the voxel coordinates as vertices, and
then connected each voxel to its adjacent neighbours in a 3 × 3 × 3 voxel
neighbourhood. For the next refinement, we fixed the vertices that lie
on the boundary and iteratively removed vertices that have only one
nearest neighbour, which effectively removed branches of the 1D curve
that did not connect to a node. Finally, we converted the remaining 1D
curve into a straight line of bonds by iteratively removing vertices with
two neighbours and then connecting them to one another. The end of
this process usually results in having small clusters of vertices at the
site of a node, which we rectified by replacing them with a single vertex,
ultimately resulting in the skeletal graph with the same topology as the
network that we started out with.
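
The two graph refinements described above (pruning dangling branches, then replacing chains of two-neighbour vertices by straight bonds) can be sketched in Python as follows. This is a minimal illustration on a toy graph, not the custom Mathematica code: the function names and the toy network are hypothetical, and the final merging of vertex clusters at nodes is omitted.

```python
def prune_spurs(adjacency, fixed=frozenset()):
    """Iteratively delete vertices with at most one neighbour (dangling branches)."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    while True:
        leaves = [v for v, nbrs in graph.items() if len(nbrs) <= 1 and v not in fixed]
        if not leaves:
            return graph
        for v in leaves:
            for u in graph.pop(v):
                if u in graph:
                    graph[u].discard(v)


def straighten_chains(adjacency, fixed=frozenset()):
    """Remove two-neighbour vertices and bond their neighbours to one another."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    changed = True
    while changed:
        changed = False
        for v in list(graph):
            nbrs = graph[v]
            if len(nbrs) != 2 or v in fixed:
                continue
            a, b = nbrs
            if b in graph[a]:
                continue  # contracting would duplicate an existing bond
            del graph[v]
            graph[a].discard(v)
            graph[b].discard(v)
            graph[a].add(b)
            graph[b].add(a)
            changed = True
    return graph


def add_bond(graph, u, v):
    """Insert an undirected edge between u and v."""
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


# Toy voxel graph: four junctions joined pairwise by two-voxel chains, plus one spur.
toy = {}
junctions = "ABCD"
for index, p in enumerate(junctions):
    for q in junctions[index + 1:]:
        chain = [p, (p, q, 0), (p, q, 1), q]
        for u, v in zip(chain, chain[1:]):
            add_bond(toy, u, v)
add_bond(toy, ("A", "B", 0), "spur1")
add_bond(toy, "spur1", "spur2")

skeleton = straighten_chains(prune_spurs(toy))
print({node: sorted(nbrs) for node, nbrs in skeleton.items()})
```

Passing boundary vertices through the `fixed` argument mimics the step of holding them in place so that struts cut by the edge of the volume are not pruned away.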

We then applied an optimization procedure to ensure that the skeleton
lines lie along the regions of maximal density in the 3D volume
data. This was achieved using an algorithm described in ref. 7°, which
defines an optimization functional
$I[\phi] = \sum_{\langle ij \rangle} \int_{\langle ij \rangle} ds\, \phi(\mathbf{x})$
that averages the local intensity $\phi(\mathbf{x})$ over the skeleton bonds, where $\langle ij \rangle$ denotes
the skeleton bond connecting $\mathbf{x}_i$ and $\mathbf{x}_j$, and s represents the arc length
along each skeletal bond. We used a cubic spline interpolation to create
the density (or intensity I) function $\phi(\mathbf{x})$, based on reconstructed 3D
volume data from images. We maximized this functional with respect
to node positions $\mathbf{x}_i$ in order to optimize the skeleton position over
density $\phi$. We analysed the structure of the optimized skeletal graphs
by calculating the skeleton dihedral angles and the length and orientation
of the bonds that make up the graphs. For a given triplet of
consecutive bonds, we defined the two planes and their normals
as $\hat{n}_{\alpha\beta} = (\hat{r}_\alpha \times \hat{r}_\beta)/|\hat{r}_\alpha \times \hat{r}_\beta|$ and $\hat{n}_{\beta\gamma} = (\hat{r}_\beta \times \hat{r}_\gamma)/|\hat{r}_\beta \times \hat{r}_\gamma|$, where $\hat{r}_\alpha$, $\hat{r}_\beta$ and $\hat{r}_\gamma$
are the unit vectors along the bonds. The dihedral angle $\theta$ is defined
as the angle between these plane normals, with $\sin\theta = (\hat{n}_{\alpha\beta} \times \hat{n}_{\beta\gamma}) \cdot \hat{r}_\beta$ and
$\cos\theta = \hat{n}_{\alpha\beta} \cdot \hat{n}_{\beta\gamma}$. We calculated this measure for all consecutive triplets
of bonds in the skeletal graph. We also calculated the bond length as
$l_{ij} = |\mathbf{r}_{ij}|$, where $\mathbf{r}_{ij} = \mathbf{x}_i - \mathbf{x}_j$ with i and j denoting the nodes (end points) of
the struts, and the spherical angle coordinates $\Theta$ and $\Phi$ describing the
length and orientation anisotropy. The node angle $\omega = \cos^{-1}(\hat{r}_\alpha \cdot \hat{r}_\beta)$ is
computed for each of the three pairs (α, β) of struts that meet at a single
node; this is done for all nodes in the unit cell. Data from this analysis
are presented as a polar histogram of dihedral angles in Fig. 3c, a histogram
of node angles in Extended Data Fig. 4, Mercator plots of orientation
in Fig. 3e, f, and anisotropy of strut length in Fig. 4d. See the
supporting software 3, 4.
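
The dihedral-angle and node-angle definitions can be checked on an idealized geometry with a short Python sketch. The three bond directions below are an assumed example of consecutive <110> struts in a cubic gyroid network (chosen for illustration, not taken from the data), for which the dihedral magnitude should equal cos⁻¹(1/3) ≈ 70.5° and the node angle 120°. Function names are hypothetical.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(p / norm for p in a)


def dihedral_deg(bond_a, bond_b, bond_c):
    """Signed dihedral angle for three consecutive bonds, via sin and cos of the plane normals."""
    ra, rb, rc = unit(bond_a), unit(bond_b), unit(bond_c)
    n_ab = unit(cross(ra, rb))
    n_bc = unit(cross(rb, rc))
    sin_theta = dot(cross(n_ab, n_bc), rb)
    cos_theta = dot(n_ab, n_bc)
    return math.degrees(math.atan2(sin_theta, cos_theta))


def node_angle_deg(strut_a, strut_b):
    """Angle between two struts leaving the same node."""
    cosine = max(-1.0, min(1.0, dot(unit(strut_a), unit(strut_b))))
    return math.degrees(math.acos(cosine))


# Consecutive bonds along a path through two threefold nodes (idealized cubic geometry).
theta = dihedral_deg((1, 0, -1), (1, -1, 0), (1, 0, 1))
# Two of the three struts leaving a planar threefold node.
omega = node_angle_deg((1, -1, 0), (-1, 0, 1))
print(f"|dihedral| = {abs(theta):.2f} deg, node angle = {omega:.2f} deg")
```

Using `atan2` with both the sine and cosine expressions keeps the sign of the dihedral, which distinguishes the two enantiomorphic networks.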


Calculation of curvature. See Extended Data Fig. 6g, h (with regard to 
IMDS curvature). We computed the mean curvature (H) and Gaussian 
curvature (K) of the IMDS. The IMDS is represented as a triangulated 
mesh, which we identified by finding an isosurface of the linear interpolation
of the density data, at a density threshold that separated the 3D volume into three types of
domain with 20%, 20% and 60% volume.

We further used two-step conditioning by first applying an edge-length
regularization to the mesh, and then constraining the mesh
vertices to lie on the isosurface of a third-order interpolation of the
density to ensure that mesh vertices represent a surface that is at least
second-order differentiable. To regularize the triangle edge lengths,
we minimized a regularization functional defined as $F_{\mathrm{reg}} = \sum_{\langle ij \rangle} (l_{ij} - \bar{l})^2$.
We optimized this functional via a gradient-descent approach by taking
the gradient with respect to the triangle vertex positions, and applied
a constraint such that all resulting vertices lie on the surface by subtracting
the component of the gradient parallel to the vertex normal. To constrain
the mesh vertices, we created a third-order Hermite interpolation of
density $\phi$ and we constrained each triangle vertex to lie along the $\phi = 0.4$
isosurfaces within this interpolation. We accomplished this by minimizing
$(\phi - 0.4)^2$ for each vertex, again using the gradient-descent approach.
We finally used the patch curvature function in the Matlab File Exchange
developed by D.-J. Kroon (https://www.mathworks.com/matlabcentral/fileexchange/32573-patch-curvature)
in order to compute curvatures
on the optimized triangulated mesh that represents the IMDS. This
algorithm calculated the principal curvatures $\kappa_1$ and $\kappa_2$ associated with
each triangulated vertex by fitting a paraboloid to that vertex and its
nearest neighbours, with the paraboloid axis constrained along the
vertex normal. From the principal curvatures, the mean curvature
$H = (\kappa_1 + \kappa_2)/2$ and Gaussian curvature $K = \kappa_1 \kappa_2$ can be calculated for each
vertex. Curvature-distribution data are shown in Fig. 2b and Extended
Data Fig. 3. See also the supporting software 5, 6, 7.


Calculation of skeleton-IMDS and IMDS-IMDS distances and 
strut diameters. See Extended Data Fig. 6g, h (with regard to the
skeleton-IMDS and IMDS-IMDS distances). We used the optimized
IMDS triangulated mesh and discretized skeletal graph to compute
skeleton-IMDS and IMDS-IMDS distances and the effective diameters
of the tubular networks. We carried out skeletal-graph discretization by
choosing a discretization length $d = \langle l_{ij} \rangle/100$, where $\langle l_{ij} \rangle$ is the average
bond length, and for each bond we chose $l_{ij}/d$ evenly spaced points along
the bond. We calculated distances between the skeleton and IMDS by
finding the nearest skeleton point for each vertex on the IMDS triangulated
mesh, and IMDS-IMDS distances by considering separately the
two IMDS surfaces resulting from individual networks and finding the
nearest vertex in one IMDS from each vertex on the other. To calculate
the effective strut diameter, we found the (quasi-ellipsoidal) 1D intersection
of the computed IMDS and a 2D plane that bisects a strut. An
‘average tube radius’ is computed by dividing the length of the 1D path
(that is, the circumference) by 2π. The skeleton-IMDS and IMDS-IMDS
distances are plotted in Fig. 2d, and the effective strut diameter versus
strut length in Fig. 3g. See also the supporting software 8, 9.
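
The distance and radius measures described above can be illustrated with a minimal Python sketch: a bond is sampled at evenly spaced points, the skeleton-IMDS distance of a surface vertex is taken as the distance to the nearest sample, and an average tube radius is obtained from the perimeter of a closed cross-section path divided by 2π. The coordinates, spacing and function names are assumptions chosen for illustration; the brute-force nearest-point search would be replaced by a spatial index for real meshes.

```python
import math


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def discretize_bond(start, end, spacing):
    """Evenly spaced sample points along a straight skeletal bond (end points included)."""
    n = max(1, round(distance(start, end) / spacing))
    return [tuple(a + (b - a) * i / n for a, b in zip(start, end)) for i in range(n + 1)]


def nearest_distance(point, samples):
    """Shortest distance from a surface vertex to the sampled skeleton."""
    return min(distance(point, s) for s in samples)


def average_tube_radius(loop_points):
    """Perimeter of a closed cross-section path divided by 2*pi."""
    count = len(loop_points)
    perimeter = sum(distance(loop_points[i], loop_points[(i + 1) % count]) for i in range(count))
    return perimeter / (2.0 * math.pi)


# Illustrative bond of length 10 sampled every 0.1, and a surface vertex 3 units away.
samples = discretize_bond((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), 0.1)
minor_thickness = nearest_distance((5.0, 3.0, 0.0), samples)

# Illustrative circular cross-section of radius 2, approximated by a 360-sided polygon.
loop = [(2.0 * math.cos(math.radians(k)), 2.0 * math.sin(math.radians(k)), 0.0) for k in range(360)]
radius = average_tube_radius(loop)
print(len(samples), round(minor_thickness, 3), round(radius, 3))
```

For a non-circular cross-section the same perimeter-based estimate returns the radius of the circle with equal circumference, which is the sense in which the radius is an average.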
Extended Data Fig. 1 | Variation in the unit-cell parameters of triclinic unit
cells within one grain. Unit-cell parameters at five different places (1-5) within
one grain are measured in real space. The result indicates that the structure is
coherent but non-cubic, with unit-cell parameters exhibiting only small
deviations throughout the many-cubic-micrometre grain.
Extended Data Fig. 2 | Strain eigenvector mapping from a cDG lattice to a vtDG lattice within the slicing coordinate frame of reference. Directions and
magnitudes of deviations from cubic symmetry in different grains of the sample are not correlated with the ion-milling (slicing) direction (Z). 
Extended Data Fig. 3 | Mean (H) and Gaussian (K) distribution of IMDS surface
curvature in theoretical models. a, b, Distributions are shown for a constant
mean curvature (CMC) surface (a) and for a constant matrix thickness (CMT)
surface (b). c-f, Distributions obtained from SCF theoretical calculations of cDG
as a function of segregation strength: χN = 12.5 (c), χN = 15 (d), χN = 25 (e) and
χN = 335 (f). As in the main text, D is the cubic unit cell repeat length.
Extended Data Fig. 4 | Histogram showing the internode angles of
experimental vtDG (from SVSEM) and non-affine triclinic SCF models. 
Extended Data Fig. 5 | SAXS pattern from a region of the bulk polygranular
PS-PDMS sample. The structure can be nominally associated with a double-gyroid
morphology, with an average cubic lattice parameter of D = 130 nm.
Diffraction from the {110} and {200} families, which are forbidden for the
cubic Ia-3d space group, is observed, indicating the non-affine deformation of
the cubic double-gyroid lattice.
Extended Data Fig. 6 | The workflow for collection and analysis of SVSEM tomography data. See Methods for more details.


Extended Data Fig. 7 | Acquisition and processing of SVSEM images.
a, Illustration of the SVSEM reconstruction method. In step 1, low-energy
incident electrons (1 keV) are used to image the near-surface region of a bulk
sample. In step 2, a Ga⁺ beam is used to slice a roughly 3-nm-thick section from
the sample surface. In step 3, electrons are again used to image the ion-beam-milled
sample surface. The process is repeated. With a large enough number of
images (more than 200), the 3D morphology can be constructed via alignment
of the stack of slices. b, Different sample stage positions are used for undistorted
imaging during slice-and-view. i, For ion milling during slicing, the sample
observing surface is parallel to the ion beam. ii, For electron imaging, the sample
observing surface is perpendicular to the electron beam.
Extended Data Fig. 8 | Secondary-electron images acquired with different electron-accelerating
voltages. Corresponding raw greyscale pixel-intensity distributions are presented below the
electron images. With a lower accelerating voltage, there is a clear binary separation of the
pixels into a dark peak (left) and a bright peak (right). Each image is from a freshly sliced region.


Extended Data Fig. 9 | Alignment fiducials and monitoring of slice thickness
during experiments. a, An X-shaped fiducial (within the red square) in ion-beam
view is used for the registration of FIB slicing. b, An X-shaped fiducial (within the
yellow square) in electron-beam view is used for registration of electron
imaging. The round cross-section of the perpendicularly drilled hole (within the
orange square) is used for the fine registration of secondary-electron images.
c, A secondary-electron image used for data acquisition, showing an ion-milled
hole (within the orange square) used for fine registration. d, Monitoring of slice
thickness: we measured the distance between the milling surface of the nth slice
and the milling surface of the first slice (total slice thickness d) from FIB images,
and then plotted d versus (n − 1). This reveals a linear relationship with a slope of
2.96 ± 0.01, which is the averaged slice thickness (in nm).
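
The slope-based estimate of slice thickness described in panel d can be illustrated with an ordinary least-squares fit of cumulative depth d against (n − 1). The sketch below is only an illustration under stated assumptions: the depth values are invented for the example and are not the measured data, and `fit_line` is a hypothetical helper rather than the analysis code used for the figure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cumulative depths d (nm) of the milling surface of slice n,
# measured relative to the first slice. Illustrative values only.
slice_index = [1, 2, 3, 4, 5, 6]
depth_nm = [0.0, 3.1, 5.8, 9.0, 11.9, 14.8]

# Fit d against (n - 1); the slope is the average thickness per slice.
slope, intercept = fit_line([n - 1 for n in slice_index], depth_nm)
print(f"average slice thickness: {slope:.2f} nm per slice")
```

Fitting the cumulative depth, rather than averaging differences between neighbouring slices, uses every slice in the estimate and dampens the effect of noise in any single FIB image.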
Extended Data Fig. 10 | Monitoring of potential SEM image rotation during
FIB-SEM image collection. a, SEM raw data image of the region of interest, with
two holes drilled normal to the slice surface by the FIB. b, c, Side-view snapshots
of 3D reconstructed holes 1 and 2 (the corresponding rotational videos are in
Supporting Video 3). The image stack (80 slices) was aligned using hole 1. The
reconstruction of hole 2 is still symmetric, indicating no image rotation.
Extended Data Fig. 11 | Important Fourier components of the experimental
double-gyroid structure. The overall intensity of each diffraction family was
normalized by the strongest {211} family from the 3D FFT of the region
containing approximately 160 unit cells (the same region as in Fig. 4f, which
shows the intensity of each individual peak). The overall normalized intensity
data are plotted against their associated cubic g value for those planes. For our
reconstructions, we use a Bragg filter that selects peaks above an intensity
threshold of 107 (Inin) for associated g values smaller than 0.2 nm⁻¹.
Methane is a powerful greenhouse gas and is targeted for emissions mitigation by the
US state of California and other jurisdictions worldwide. Unique opportunities for
mitigation are presented by point-source emitters, that is, surface features or
infrastructure components that are typically less than 10 metres in diameter and emit
plumes of highly concentrated methane. However, data on point-source emissions are sparse
and typically lack sufficient spatial and temporal resolution to guide their mitigation
and to accurately assess their magnitude. Here we survey more than 272,000
infrastructure elements in California using an airborne imaging spectrometer that
can rapidly map methane plumes. We conduct five campaigns over several months
from 2016 to 2018, spanning the oil and gas, manure-management and waste- 
management sectors, resulting in the detection, geolocation and quantification of 
emissions from 564 strong methane point sources. Our remote sensing approach 
enables the rapid and repeated assessment of large areas at high spatial resolution for 
a poorly characterized population of methane emitters that often appear 
intermittently and stochastically. We estimate net methane point-source emissions in 
California to be 0.618 teragrams per year (95 per cent confidence interval 0.523- 
0.725), equivalent to 34-46 per cent of the state’s methane inventory for 2016.
Methane ‘super-emitter’ activity occurs in every sector surveyed, with 10 per cent of
point sources contributing roughly 60 per cent of point-source emissions, consistent
with a study of the US Four Corners region that had a different sectoral mix. The
largest methane emitters in California are a subset of landfills, which exhibit 
persistent anomalous activity. Methane point-source emissions in California are 
dominated by landfills (41 per cent), followed by dairies (26 per cent) and the oil and 
gas sector (26 per cent). Our data have enabled the identification of the 0.2 per cent of 
California’s infrastructure that is responsible for these emissions. Sharing these data 
with collaborating infrastructure operators has led to the mitigation of anomalous 
methane-emission activity.


Methane (CH₄) is being increasingly prioritized for near-term climate
action, given its relatively short atmospheric lifetime and the potential
for rapid, focused mitigation that can complement economy-wide
efforts to reduce carbon dioxide emissions. In California, efforts to
mitigate methane emissions are complicated by large inconsistencies
between estimates of emissions derived from atmospheric measurements
and from greenhouse-gas inventories: past studies using
atmospheric measurements report methane emissions that are higher
than those from inventories, both statewide and in key regions and
sectors. Other studies indicate that methane emissions from the
oil and gas supply chain are about 60% higher than those reported
in the national greenhouse-gas inventory and that there is a heavy-tail
distribution of methane-emission sources in the US natural gas
supply chain, where typically fewer than 20% of sources (so-called
super-emitters) contribute more than 60% of total emissions from
that sector. Scientists and policymakers have emphasized the rapid
identification and mitigation of methane super-emitters, particularly
those due to leaks and abnormal operating conditions.

Beyond California, large uncertainties remain regarding
the distribution of methane emissions in other key regions and emission
sectors globally. There is a dearth of available observational studies
of sectors such as livestock manure management and landfills, both of
which are predicted to be larger contributors to California’s methane
budget than the oil and gas sector. In addition, spatially sparse and
infrequent field studies can overestimate or underestimate important
methane sources that are intermittent or highly unpredictable. Finally,
the relative contributions of methane point sources and area sources
have not been well studied in California. We define ‘point source’ as a
condensed surface feature or infrastructure component of less than
10 m in diameter that emits plumes of highly concentrated methane.
This contrasts with an ‘area source’, or the combined effect of many
small emitters distributed over a large area (typically 1-100 km across)
that releases methane in a more diffuse fashion; area sources include
anaerobic decomposition from rice cultivation and enteric fermentation
from ruminant animals, both of which are better addressed with
other measurement methods and are not included here.

Fig. 1 | Images from our survey of methane point emissions in California.
Main image, approximately 2,000 individual AVIRIS-NG flight lines from 2016
(blue) and 2017 (green) covered more than 272,000 individual facilities and
infrastructure elements. Detected sources are indicated by red points, with
the densest clusters seen in the San Joaquin Valley (dairies and oil fields).
The inset images show examples of representative methane plumes from
different sectors: a, compressor stations at a natural gas storage facility;
b, oil well; c, tank of liquefied natural gas; d, dairy manure management;

The California Methane Survey was designed to provide the first
systematic survey of methane point sources across the state, with
a focus on detecting, geolocating and quantifying super-emitters.
This survey fills an important gap in scale, and complements other
observational systems that provide aggregate constraints on emissions
from regions and area sources and short-duration field campaigns
that are limited to a small number of facilities. The survey was
conducted with the Next Generation Airborne Visible/Infrared Imaging
Spectrometer (AVIRIS-NG). AVIRIS-NG measures ground-reflected
solar radiation at wavelengths from 380 nm to 2,510 nm with 5-nm
spectral sampling, and has a 1.8-km field of view and 3-m pixel
resolution at typical survey altitudes of 3 km. This class of instrument
is unique in terms of its high signal-to-noise ratio, calibration accuracy
and response uniformity. The methane retrieval is based on absorption
spectroscopy and can reliably detect and quantify methane
point sources with emissions typically as small as 2-10 kg CH₄ h⁻¹ for
typical surface winds of 5 m s⁻¹, depending on surface brightness and
aircraft altitude and ground speed. See the Supplementary Information
for a detailed description of datasets, estimation methods and
validation.

e, wastewater-treatment plant; f, landfill. The colour scales indicate the methane
concentration enhancement (the mass of methane in a plume relative to
background air) in each pixel in units of parts per million-metre (ppm-m). Inset
images are from AVIRIS-NG. The basemap image is from Google Earth, Lamont-Doherty
Earth Observatory (LDEO)-Columbia, National Science Foundation
(NSF), National Oceanic and Atmospheric Administration (NOAA), Landsat/
Copernicus, Scripps Institution of Oceanography (SIO), US Navy, General
Bathymetric Chart of the Oceans (GEBCO).

The spatial and sectoral scope of this survey comprised key methane
point-source emission sectors in California, including: oil and
gas production, processing, transmission, storage and distribution;
refineries; dairy manure management; landfills and composting facilities;
wastewater-treatment plants; gas-fired power plants; and liquefied
and compressed natural gas facilities. Multiple overflights were
conducted for the same infrastructure over several years to assess
source persistence.

AVIRIS-NG flights for this study were conducted during five cam- 
paigns: August to November 2016, March 2017, June 2017, August to 
November 2017, and September to October 2018. The survey imaged 
approximately 59,000 km², including revisits (Fig. 1). The survey was
designed to cover at least 60% of methane point-source infrastructure 
in California, guided by a Geographic Information System (GIS) dataset 
known as Vista-CA (see Supplementary Information). Approximately 
272,000 infrastructure elements were covered by the survey, most 
of which were observed multiple times. The survey included more 
than 200,000 oil and gas wells and related production infrastructure, 
representing a sample size more than 500 times larger than previous 
point-source persistence studies.

The AVIRIS-NG flights conducted during this survey detected 1,181 
individual methane plumes; for each plume we estimated the enhance- 
ment (the mass of methane in the plume relative to background air) 
and attributed it to a Vista-CA infrastructure element (Fig. 1). Average 
emission rates and 1σ uncertainties were estimated for 564 distinct
sources at 250 facilities, using observed methane enhancements and
surface wind speed data from weather reanalysis products. The sum of
our measured source emissions is 0.511 Tg CH₄ yr⁻¹, and we apply a
non-parametric bootstrap analysis to the population of observed sources
to calculate a 95% confidence interval of 0.433-0.601 Tg CH₄ yr⁻¹. The
population has a heavy-tail distribution, indicating that 10% of the 
point sources are responsible for 60% of the detected point-source 
emissions (Fig. 2 and Supplementary Information), spanning every 
sector surveyed. 
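
The non-parametric bootstrap mentioned above can be sketched as follows. This is a minimal illustration, assuming a simple percentile bootstrap on the summed emissions; the exact procedure used in the study is described in the Supplementary Information and may differ. The source population here is synthetic (drawn from a lognormal distribution with arbitrary parameters), and `bootstrap_total_ci` is a hypothetical helper name.

```python
import random


def bootstrap_total_ci(emissions, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the sum of emissions.

    Each resample draws len(emissions) sources with replacement and sums them.
    """
    rng = random.Random(seed)
    n = len(emissions)
    totals = sorted(
        sum(rng.choices(emissions, k=n)) for _ in range(n_resamples)
    )
    lower = totals[int((alpha / 2) * n_resamples)]
    upper = totals[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic heavy-tailed population of 564 sources (arbitrary units);
# these are NOT the survey data.
rng = random.Random(1)
sources = [rng.lognormvariate(3.5, 1.3) for _ in range(564)]

total = sum(sources)
low, high = bootstrap_total_ci(sources)
print(f"total = {total:.0f}, 95% interval = ({low:.0f}, {high:.0f})")
```

Resampling whole sources with replacement keeps the heavy tail of the population in every replicate, which is why the resulting interval can be noticeably asymmetric around the measured total.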
Fig. 2 | The distribution of point-source emissions is consistent between two
different regions. a, Data from 564 methane point sources for all sectors in
California (red; this study) and from 250 coal, oil and gas sources from the Four
Corners region (blue). The numbers for California have not been adjusted for
persistence here, as this was not possible for the brief Four Corners study. The
heavy-tail distribution indicates that 10% of the point sources are responsible
for 60% of the detected point-source emissions. b, Histogram showing the
density of point-source emissions with lognormal fits. Note that the Four
Corners region includes some large emitters associated with coal production
that do not occur in California. The vertical dashed lines indicate typical
detection limits for this class of infrared imaging spectrometer, ranging from
2-10 kg CH₄ h⁻¹ for the typical 3-km flight altitudes used in this study to
100 kg CH₄ h⁻¹ for an equivalent satellite in low Earth orbit.
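
The heavy-tail statement in this caption (10% of sources, 60% of emissions) can be checked on any list of source emission rates with a few lines of code. The sketch below is illustrative only: `top_share` and `lognormal_top_share` are hypothetical helper names, the sample is synthetic, and the closed-form expression is a general property of the lognormal distribution rather than a result taken from the study.

```python
import random
from statistics import NormalDist


def top_share(emissions, fraction=0.10):
    """Share of total emissions contributed by the largest `fraction` of sources."""
    ranked = sorted(emissions, reverse=True)
    k = max(1, round(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def lognormal_top_share(sigma, fraction=0.10):
    """Top-`fraction` share for an ideal lognormal with log-scale spread sigma.

    Follows from the lognormal Lorenz curve: share = Phi(sigma - z), where z is
    the standard normal quantile at (1 - fraction).
    """
    z = NormalDist().inv_cdf(1.0 - fraction)
    return NormalDist().cdf(sigma - z)


# Synthetic sample with arbitrary parameters, NOT the survey data.
rng = random.Random(42)
synthetic = [rng.lognormvariate(4.0, 1.5) for _ in range(5000)]

print(f"empirical top-10% share: {top_share(synthetic):.2f}")
print(f"ideal lognormal, sigma = 1.5: {lognormal_top_share(1.5):.2f}")
```

Under the lognormal assumption, a top-decile share near 60% corresponds to a log-scale standard deviation of roughly 1.5, which gives a quick sense of how broad the fitted distributions in panel b would need to be.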


The repetitive, high-spatial-resolution plume imagery enabled us 
to characterize point-source behaviour and controlling processes, 
particularly for sectors that have not been as well studied as the oil and 
gas production sector. Many of the sources were highly intermittent, 
with a median persistence of 0.20 for the entire population (mean 
0.33, range 0.02-1.0). In some cases, the intermittent emissions can be
explained by normal operations (for example, periodic waste flushing 
at large dairies). In other cases, more persistent activity is apparently 
due to sustained venting at a small number of anaerobic digesters at 
dairies and wastewater-treatment plants, or to leaking bypass valves at 
natural gas compressor stations. We find a similar distribution of persis- 
tence (20-35% on average) and emissions in the manure-management, 
wastewater-treatment and oil and gas sectors. Solid-waste management
is the largest methane point-source emission sector in California
(Table 1), with persistent plumes observed at only 32 of 436 surveyed 
landfills and composting facilities. Our imaging of landfills identified 
methane plumes associated with construction, gaps in intermediate 
cover and leaking gas-capture wells—indicating a subpopulation of 
anomalous emitters (see Supplementary Information). The fact that 
we did not detect a larger population of smaller methane point sources 
across the landfill sector suggests that most of those facilities emit 
methane as area sources that cannot be detected with this method. 
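
Persistence, as used above, can be read as the fraction of overflights on which a plume was detected at a given source. The sketch below shows one plausible way to turn that into a persistence-adjusted emission rate; it is an assumption made for illustration (the study's exact procedure is given in the Supplementary Information), and the function names and numbers are hypothetical.

```python
def persistence(n_detections, n_overflights):
    """Fraction of overflights on which a plume was detected (0 to 1)."""
    if n_overflights <= 0:
        raise ValueError("n_overflights must be positive")
    if not 0 <= n_detections <= n_overflights:
        raise ValueError("n_detections must lie between 0 and n_overflights")
    return n_detections / n_overflights


def persistence_adjusted_rate(plume_rates_kg_h, n_overflights):
    """Mean emission rate over detected plumes, scaled by detection frequency.

    Assumes the source emits nothing detectable on overflights without a plume.
    """
    if not plume_rates_kg_h:
        return 0.0
    mean_rate = sum(plume_rates_kg_h) / len(plume_rates_kg_h)
    return persistence(len(plume_rates_kg_h), n_overflights) * mean_rate


# Hypothetical source: 2 plumes detected in 10 overflights.
rates = [100.0, 200.0]  # kg CH4 per hour, illustrative values
print(persistence(len(rates), 10))          # detection frequency
print(persistence_adjusted_rate(rates, 10))  # time-averaged rate, kg/h
```

With this reading, a source seen in 2 of 10 overflights has a persistence of 0.20, matching the median reported for the population, and its time-averaged rate is one fifth of its mean rate while emitting.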

Given that we surveyed a large fraction (32-100%) of every point- 
source emission sector in California, we can upscale our measurements 
to estimate statewide point-source emissions, resulting in a total of 
0.618 (95% confidence interval 0.523-0.725) Tg CH₄ yr⁻¹, equivalent to
34-46% of the California Air Resources Board (CARB) methane inventory
for 2016. We find that solid-waste management contributes 41%
of observed point-source emissions, followed by 26% from manure
management and 26% from oil and gas (contrasting with the 32%, 39%
and 25% of total methane emissions found for these sectors in the CARB
inventory). We estimate that upstream oil and gas production contributes
about 79% of the total oil and gas methane point-source emissions
in California. Spatially, 85% of point-source emissions from upstream 
production are concentrated in the southern San Joaquin Valley (the 
highest oil- and associated-gas-producing region in the state), 14% in 
Los Angeles and Ventura counties, and 1% in the Sacramento Valley. We 
emphasize that the relative contribution of emission sectors probably 
varies in other regions around the world owing to regional differences 
in economic activity, age of infrastructure, and regulation. We also 
highlight that there are no doubt regional differences in the relative 
sectoral contributions of area sources (such as urban gas-distribution
systems) that are beyond the scope of this study. 
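
The upscaling step can be illustrated with the sectoral scalars reported in Table 1, where most scalars are the ratio of the number of Vista-CA infrastructure elements to the number of surveyed elements. The sketch below reproduces two rows of that table; the function names are hypothetical, and the three sectors for which the study constrains or removes scaling are not handled here.

```python
def sector_scalar(n_inventory_elements, n_surveyed_elements):
    """Population scalar: inventory elements per surveyed element."""
    if n_surveyed_elements <= 0:
        raise ValueError("n_surveyed_elements must be positive")
    return n_inventory_elements / n_surveyed_elements


def upscale(measured_tg_per_yr, n_inventory_elements, n_surveyed_elements):
    """Scale measured sector emissions to a statewide sector estimate."""
    return measured_tg_per_yr * sector_scalar(
        n_inventory_elements, n_surveyed_elements
    )


# Values taken from Table 1 (measured emissions in Tg CH4 per year).
dairies = upscale(0.115, 620, 443)       # dairy confined animal feeding operations
power_plants = upscale(0.007, 435, 238)  # gas-fired power plants

print(f"dairy scalar: {sector_scalar(620, 443):.2f}")
print(f"dairies, state total: {dairies:.3f} Tg CH4 per year")
print(f"power plants, state total: {power_plants:.3f} Tg CH4 per year")
```

Simple ratio scaling assumes the unsurveyed facilities emit like the surveyed ones, which is presumably why the study restricts or removes scaling for sectors such as landfills, where a small subpopulation dominates emissions.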

In addition to solid-waste management, other emission sectors may 
be greatly underestimated in the CARB inventory. When comparing 
our estimates of point-source emissions for those sectors in the CARB 
inventory most likely to include methane point sources, our sectoral 
estimates account for about 38% of the CARB inventory’s emissions 
from the wastewater-treatment sector, about 42% of emissions from 
the manure-management sector, and about 366% of the CARB inven- 
tory for the energy industries sector. The latter is probably associated 
with most refineries and a small number of high-emitting power plants
(see Supplementary Information). Large discrepancies are observed 
between many of the self-reported emissions from participating facili- 
ties and the AVIRIS-NG and independent airborne estimates (Fig. 3 and 
Supplementary Information). Moreover, our survey of point-source 
emissions in California and the US Environmental Protection Agency 
(EPA)’s Greenhouse Gas Reporting Program (GHGRP) for the entire US
are in agreement that 99% of point-source emissions come from facilities
that emit at least 25 kg h⁻¹ (see Supplementary Information). This
is notable given that manure management and oil and gas production 
contribute more than half of the point-source emissions in our study, 
but are mostly not included in the GHGRP for California and are only 
partially represented in the total US GHGRP. 

We shared preliminary findings from our surveys—including images 
of methane plumes—with collaborating facility operators, who pro- 
vided verification with surface observations and/or explained the 
mechanisms underlying the observed emissions and persistence. Many 
of these collaborative efforts led directly to mitigation of the methane 
sources detected in the survey. For example, we discovered four cases
of leaking natural gas distribution lines and one leaking liquefied natural
gas storage tank (Fig. 1); the operators confirmed and repaired these leaks
and requested verification of the repairs by follow-up AVIRIS-NG flights.

The prevalence of methane super-emitter activity in multiple sec- 
tors in California suggests substantial potential for mitigation. We 
have found that 30 facilities could be responsible for around 20% of 
the 2016 CARB methane inventory, including many that exhibit large 


Table 1 | Point-source emissions by sector

IPCC source category | Vista-CA infrastructure element | Number of Vista-CA infrastructure elements | Number of surveyed elements | Percentage surveyed | Sectoral scalar | Number of sources detected | Measured emissions (Tg CH₄ yr⁻¹) | State total emissions (Tg CH₄ yr⁻¹) | State total 95% confidence intervals (Tg CH₄ yr⁻¹) | Percentage of total emissions
1A1 Energy industries | Gas-fired power plants | 435 | 238 | 55 | 1.83 | 7 | 0.007 | 0.013 | 0.007, 0.021 | 2.1
1A1 Energy industries | Refineries | 26 | 26 | 100 | 1.00 | 37 | 0.015 | 0.015 | 0.008, 0.023 | 2.4
1A1 Energy industries | Subtotals | 461 | 264 | 57 | 1.27 | 44 | 0.022 | 0.028 | 0.015, 0.044 | 4.6
1B2 Oil and natural gas | CNG/LNG fuelling stations | 208 | 132 | 63 | 1.58 | 6 | 0.002 | 0.003 | 0.003, 0.004 | 0.5
1B2 Oil and natural gas | Natural gas stations (non-storage compressor, metering, etc) | 1,131 | 538 | 48 | 2.10 | 5 | 0.005 | 0.010 | 0.009, 0.012 | 1.6
1B2 Oil and natural gas | Natural gas pipeline (transmission, distribution) | 216,774 | 68,548 | 32 | 3.16 | 5 | 0.004 | 0.012 | 0.010, 0.014 | 1.9
1B2 Oil and natural gas | Natural gas processing plants | 26 | 23 | 88 | 1.13 | 5 | 0.004 | 0.004 | 0.004, 0.005 | 0.7
1B2 Oil and natural gas | Natural gas storage fields | 12 | 12 | 100 | 1.00 | 11 | 0.009 | 0.009 | 0.008, 0.010 | 1.4
1B2 Oil and natural gas | Oil and gas: wells | 225,766 | 198,231 | 88 | 1.14 | 107 | 0.048 | 0.054 | 0.046, 0.063 | 8.8
1B2 Oil and natural gas | Oil and gas: other production equipment | 3,356 | 2,872 | 86 | 1.00 | 120 | 0.066 | 0.066 | 0.056, 0.076 | 10.7
1B2 Oil and natural gas | Subtotals | 447,273 | 270,356 | 60 | 1.16 | 259 | 0.137 | 0.158 | 0.135, 0.184 | 25.6
3A2 Manure management | Dairy confined animal feeding operations | 620 | 443 | 71 | 1.40 | 215 | 0.115 | 0.161 | 0.137, 0.187 | 26.1
4A1 Managed waste disposal | Landfills and composting facilities | 1,146 | 436 | 38 | 1.11 | 32 | 0.229 | 0.255 | 0.175, 0.345 | 41.3
4D1, 4D2 Wastewater treatment and discharge | Domestic and industrial wastewater treatment | 148 | 57 | 39 | 2.60 | 12 | 0.004 | 0.012 | 0.005, 0.020 | 1.9
4D1, 4D2 Wastewater treatment and discharge | Industrial wastewater treatment: beef processing | NA | NA | NA | 1.00 | 2 | 0.004 | 0.004 | 0.004, 0.005 | 0.6
Totals | | 449,648 | 271,556 | 60 | 1.21 | 564 | 0.511 | 0.618 | 0.523, 0.725 | 100.0

The table summarizes the persistence (frequency)-adjusted point-source emissions found in this study according to sectors identified by the Intergovernmental Panel on Climate Change
(IPCC), as well as estimated total emissions derived with population scalars. Most of the scalars are simply the ratio of the number of infrastructure elements identified by Vista-CA to the
number of surveyed elements, with three exceptions (oil and gas: other production equipment; landfills and composting facilities; and industrial wastewater treatment), for which we further
constrain or eliminate scaling. See Supplementary Information section 2 for details.


discrepancies between reported and measured emissions (see Fig. 3
and Supplementary Information). Our survey in California and a previous
study of the Four Corners region in the US exhibit consistent heavy-tail
distributions of methane point-source emissions (Fig. 2) despite the
different sectoral mixes for the two regions (the Four Corners emissions
are associated primarily with oil, gas and coal production). If similar
distributions of methane point-source emissions occur in other key
regions around the world, this could translate to as much as 8-11% of
global greenhouse-gas forcing, assuming a 100-year warming potential
of 32 and 350 Tg CH₄ yr⁻¹ of total anthropogenic methane emissions
for 2016. Testing this hypothesis would require additional
aircraft surveys and satellite observations that can provide the necessary
combination of high spatial resolution, sensitivity and wide area
coverage for other key regions globally. Those broader studies would
also improve our understanding of waste and manure-management
emissions, which, as in California, might dominate the emission budgets
of other regions.


Detection limits for methane point sources could be relaxed by a factor
of ten compared with the survey described here and still identify 90% of
super-emitters if applied frequently over large areas that have emission
distributions similar to those of California (Fig. 2). Because the detection
limit scales linearly with spatial resolution, mature technologies such as that
used here could be deployed for more efficient point-source monitoring
across larger regions on high-altitude aircraft and satellites. Our
high-performance infrared imaging spectroscopy would translate to a robust
detection limit of 100 kg CH₄ h⁻¹ for a satellite in low Earth orbit, depending
on spatial resolution (assuming a wind speed of 5 m s⁻¹). Widespread
and sustained deployment of point-source remote sensing methods such
as ours, when combined with near-continuous regional monitoring of
distributed area sources by surface observations and other satellites,
could greatly advance scientific understanding of methane budgets and
efforts to manage them. Complete closure of the methane budget and
effective mitigation will no doubt require a multi-tiered observational
strategy, in which the methods demonstrated here could play a key part.
Fig. 3 | Independent airborne measurements of emissions from
representative facilities on the basis of simultaneous flights or several
visits. a, Simultaneous flights; b, average emissions from multiple non-
simultaneous flights over several months. Orange bars show AVIRIS-NG
estimates of point-source emissions, and blue bars show estimates by Scientific
Aviation (Boulder, CO, USA) of facility net emissions. Error bars indicate one
Data availability


Radiance and reflectance products calibrated by AVIRIS-NG can be 
ordered from the AVIRIS-NG data portal at https://avirisng.jpl.nasa.gov/ 
alt_locator/. Retrieved methane images from flight lines in this study 
are available for download at https://doi.org/10.3334/ORNLDAAC/1727. 
Vista-CA infrastructure spatial layers are available for download at 
https://doi.org/10.3334/ORNLDAAC/1726. Images of methane plumes, 
Vista-CA layers and regional-scale methane-emission products for 
California can be viewed at https://methane.jpl.nasa.gov/. Tables of 
methane plume and source characteristics are provided in the Sup- 
plementary Information. 


Code availability 


The custom computer code or algorithms used to generate the results
in this study can be made available to researchers upon request. 
Anatomically modern humans originated in Africa around 200 thousand years ago
(ka). Although some of the oldest skeletal remains suggest an eastern African origin,
southern Africa is home to contemporary populations that represent the earliest
branch of human genetic phylogeny. Here we generate, to our knowledge, the largest
resource for the poorly represented and deepest-rooting maternal L0 mitochondrial
DNA branch (198 new mitogenomes for a total of 1,217 mitogenomes) from
contemporary southern Africans and show the geographical isolation of L0d1'2, L0k
and L0g KhoeSan descendants south of the Zambezi river in Africa. By establishing
mitogenomic timelines, frequencies and dispersals, we show that the L0 lineage
emerged within the residual Makgadikgadi-Okavango palaeo-wetland of southern
Africa, approximately 200 ka (95% confidence interval, 240-165 ka). Genetic
divergence points to a sustained 70,000-year-long existence of the L0 lineage before
an out-of-homeland northeast-southwest dispersal between 130 and 110 ka.
Palaeoclimate proxy and model data suggest that increased humidity opened green
corridors, first to the northeast then to the southwest. Subsequent drying of the
homeland corresponds to a sustained effective population size (L0k), whereas wet-dry
cycles and probable adaptation to marine foraging allowed the southwestern migrants
to achieve population growth (L0d1'2), as supported by extensive south-coastal
archaeological evidence. Taken together, we propose a southern African origin of
anatomically modern humans with sustained homeland occupation before the first
migrations of people that appear to have been driven by regional climate changes.


Southern Africa has long been considered to be one of the regions in
which anatomically modern humans (AMHs) originated. The region is home to
contemporary populations who represent the earliest human lineages, and
evolutionary time estimates have largely been based on mitochondrial DNA
(mitogenomes). The maternal human phylogenetic tree consists of
two major branches: the extensive L1'6, which includes the out-of-Africa
ancestral L3 sub-branch (or haplogroup), and the rare deep-rooting
L0. The L0 lineage is predominated by southern African haplogroups:
L0d, L0k and the recently described L0g. By contrast, the rare L0f and
common L0a lineages are dispersed throughout sub-Saharan Africa.
Through L0 pre-screening, we identified 198 southern Africans with
poorly represented haplogroups for whom the mitogenome was
sequenced (Supplementary Table 1), allowing for a combined analysis
of 1,217 mitogenomes (Fig. 1a and Extended Data Table 1).

We ethno-linguistically classified study participants as KhoeSan
(southern African populations who traditionally practiced foraging
and spoke languages containing ‘click’ consonants) or non-KhoeSan
individuals. Non-KhoeSan who have KhoeSan-derived L0 mitogenomes
are referred to in this study as KhoeSan ancestral, with further
geographical classification (Fig. 1b and Extended Data Table 2; terminology
pertaining to southern African KhoeSan populations is complex
and contentious, see Methods for further discussion). Contemporary
KhoeSan include Kalahari KhoeSan (Kx’a, Tuu and central Khoe-Kwadi
speakers) and west-coastal KhoeSan (Khoe-Kwadi Nama speakers).
Peoples who speak Southern Bantu languages, who migrated down
the east coast of Africa around 1,500 years ago, may have acquired an
east-coastal KhoeSan heritage. The arrival of European colonists to
the Cape in the mid-1600s gave rise to the South African Coloured and
Namibian Baster populations (of Eurasian and indigenous descent),
who acquired a Cape KhoeSan heritage. Excluding the east African
Sandawe and Hadza (whose languages also contain click consonants),
indigenous KhoeSan populations appear to be absent northeast of the
Zambezi river, supported by the lack of skeletal remains representing the
KhoeSan-like hunter-forager morphology. We classified the 198 new
mitogenomes as Kalahari (n = 18), west-coastal (n = 21), Cape (n = 109) and
east-coastal (n = 29) KhoeSan, or non-KhoeSan (Bantu, n = 19), although
two mitogenomes were classed as unknown. Using these identifiers,
we provide a best-fit classification for all 1,217 L0 mitogenomes
(Supplementary Table 2).

Fig. 1 | Geographical distribution of 1,217 L0 mitogenomes. a, Countries
within (n = 1,139) or outside (n = 78) Africa from which L0 mitogenomes were
sourced, including 198 new L0 mitogenomes (black numbers on the map). DRC,
Democratic Republic of the Congo. b, Present-day southern Africa showing the
geographical distribution of KhoeSan population identifiers defined as

Phylogenetic analysis confirms the major L0 haplogroups, with the
exclusion of L0b (Extended Data Fig. 1). Using a subset of 461 mitogenomes,
including all of the rare lineages, we establish the coalescence
times within the L0 lineage (Fig. 2a and Supplementary Table 3) and use
the complete dataset to reconstruct geographical dispersals (Fig. 2b). We
redefine the emergence of the L0 lineage to 50-25 thousand years (kyr)
before previous estimates, around 200 ka (95% confidence interval,
240-165 ka). L0d'k (n = 309; coalesced around 187 ka (the number of
mitogenomes and the coalescence time are provided for each lineage))
is largely KhoeSan-specific, emerging approximately 20 kyr before
the widely dispersed L0a'b'f'g sister branch (n = 152; around 164 ka).
Although the exact branch resolution for L0k remains undetermined, we
observe a preference for L0d'k (posterior probability of approximately
0.6) over L0a'b'f'g'k (posterior probability of about 0.4). Irrespective
of this, the L0k (n = 113) lineage appears to remain stable for around
130 kyr before diverging into the Kalahari-specific L0k1 lineage, which
is predominated by L0k1a (85 out of 94), and rarer L0k1b and L0k2 lineages
distributed around the Zambezi river (Extended Data Fig. 2a). The
L0d lineage remains stable for almost 60 kyr before splitting into the
KhoeSan-specific L0d1'2 and rarer L0d3 lineages.

Coalescing around 113 ka, LOd2 (n = 226) emerges approximately 
15 kyr before LOd1 (n=452). Within LOd2 (emerging about 91 ka), LOd2c 
diverged the earliest (n=53; around 84 ka) with a broad and almost even
KhoeSan-regional distribution (Extended Data Fig. 3 and Supplementary 
Table 4). In 2014, we derived an ancient LOd2c1c mitogenome from a
sample of the skeleton of an approximately 2,330-year-old Cape-coastal 
marine forager (St Helena (StHe)/UCT606)”. Predating archaeologi- 
cal evidence for sheep herding in the region””*, we proposed that this 
LOd2c sub-clade represented a pre-pastoral indigenous southern Afri- 
can lineage. Recently, whole-genome sequencing confirmed a unique 
southern African heritage, whereas two younger (less than 2 kyr old) 
Cape skeletons showed a genetic link to eastern Africa and the associ- 
ated pastoralist migration”. Previously, an overrepresentation of the 
LOd2b (28 out of 44; around 65 ka) and LOd2a (62 out of 118; around 
KhoeSan (orange), Kalahari and west-coastal; KhoeSan ancestral (green), Cape
or east-coastal. The Zambezi river provides a geographical division between the
KhoeSan and mostly non-KhoeSan population identifiers. Maps were generated
in the R package ‘maps’ v.3.3.0°.


60 ka) lineages within the Kalahari KhoeSan has been observed; how- 
ever, by doubling the contribution of the LOd2d (6 out of 11) lineage, 
we show a broad southern African distribution (Extended Data Fig. 3 
and Supplementary Tables 5, 6). While LOd1 is also spread throughout
the KhoeSan-regional identifier, we show notable overrepresentation 
of the LOd1b (104 out of 174; about 69 ka) and LOd1c (151 out of 184; 
approximately 59 ka) lineages within the Kalahari and of the LOd1a
(32 out of 91; around 44 ka) lineage within the Cape (Extended Data 
Fig. 4). We contribute two new KhoeSan-ancestral LOd1d mitogenomes 
to the single published mitogenome’®. 

In contrast to LOd1’2, the LOd3 lineage is not specific to southern Africa.
Although LOd3b (around 30 ka) appears to be KhoeSan-specific, the rarer
LOd3a (about 42 ka) lineage is exclusively found north of the Zambezi 
river. Notably, three out of six LOd3a mitogenomes were derived from 
east African Sandawe individuals. Our data support previous studies 
that have suggested a genetic link between east Africa and the earliest 
southern Africans”, who last shared a common ancestor around 59 ka. 
By adding a large number of mitogenomes (27 out of 40) to the LOd3 
lineage, we observe overrepresentation of LOd3b in the Cape KhoeSan 
identifier (21 out of 34) (Extended Data Fig. 2b and Supplementary 
Table 7). Using a previously reported identifier that distinguishes mater- 
nal KhoeSan ancestry for the Coloured and Baster populations”, we 
show that the LOd3b lineage is specific to the Coloured population, 
whereas the new LOd2b1a2a sub-clade is specific to the Baster popula- 
tion (Extended Data Fig. 3b). 

Within the LOa’b’f’g lineage, LOf is highly divergent (emerging around
125 ka; 95% confidence interval, 149-101 ka). By including a further five
LOf mitogenomes, we were able to show that LOf1 (13 out of 27; around 
113 ka) predominates south and LOf2’3 (14 out of 27; about 121 ka) north 
of the Zambezi river (Extended Data Fig. 2c and Supplementary Table 8). 
Within LOf1, we recognize three new branches: the northeast sister clades 
LOf1c (Zambian) and LOf1b (Tanzanian), and the South African clade
LOf1a (n=8). Lack of LOf representation within contemporary KhoeSan
suggests that the presence of LOf1a within South Africa is probably
a result of more recent east-coastal agropastoral back-migration. 
While the LOa’g lineages coalesce around 117 ka (95% confidence inter- 
val, 145-94 ka), contributing 19 southern African to 347 LOa mitoge- 
nomes, we concur that the LOa lineage probably diverged northeast 
of the Zambezi river (around 85 ka) and spread throughout Africa?; the 
Fig. 2 | LO phylogenetic tree, geographical distributions of the major
southern African LO haplogroup and out-of-homeland LO dispersal routes.
a, Phylogenetic branching and coalescence times derived from a focused subset
of 461 L0 mitogenomes, including all rare branches, and anchored to
Neanderthals (Homo neanderthalensis; n=7). The Somalian-derived (Som20) 
LOd3 mitogenome? could not be assigned. b, Geographical distribution 
(identifiers described in Fig. 1b) for all KhoeSan-specific mitogenomes (out of 
1,217): LOd3 (n=40), LOd1’2 (n = 677, excluding one unknown), LOk (n=105, 
excluding seven LOk1b and a single Yemen-derived LOk2), and LOf1 (n =13). 
Predominant geographical representation (shaded regions), with region- 
specific overflow represented by the total number of mitogenomes, including 
the country-specific representation north of the Zambezi river. c, Schematic 
map of southern Africa representing the Makgadikgadi-Okavango palaeo- 


southern representation of the LOa1b and LOa2a lineages is probably a
result of a Bantu back-migration (Extended Data Fig. 5). First described
in a Kx’a-speaking hunter-gatherer®, we now contribute three additional
and reclassify five published mitogenomes as LOg (Extended Data Fig. 2d 
and Supplementary Table 9). As the LOg lineage has a broad KhoeSan 
and KhoeSan-ancestral distribution, we hypothesize that this lineage 
diverged southwest of the Zambezi river (around 69 ka), similar to the
LOd1’2 lineage. 

Our results suggest that the greater Zambezi river basin, particu- 
larly the Kalahari region, had a critical role in shaping the emergence 
and prehistory of AMHs. Now a semi-desert, this region consists of salt
pans within northern Botswana that represent desiccated vestiges of 
palaeo-lake Makgadikgadi, which at its peak in the early Pleistocene 
would have been the largest lake in Africa”"®. Contraction of the Mak- 
gadikgadi palaeo-lake during the Middle Pleistocene was accompanied 
by development of the Okavango delta as a result of neotectonic rifting,
which, together with smaller lakes from the upper Zambezi to the
Kafue rivers, would have created a vast residual wetland favourable for
habitation by humans and mammals more broadly” (Fig. 2c). Today,
the harsh Kalahari climate and oxygen-rich salt pans are not ideal for 
fossil and pollen preservation, respectively. However, period-relevant 
lithic artefacts are documented from the Makgadikgadi pans and surroundings’”°”1,
while palynology suggests that this region was once a
grassland and forest biome”. Our data further suggest that the Mak- 
gadikgadi-Okavango palaeo-wetland sustained the existence of AMHs 
for around 70 kyr, supported by mitochondrial data of ancestral giraffe, 
lion and zebra”*>, before out-of-homeland migrations split the founder 
homeland populations of the LOd, LOf and LOa’g lineages. 

Southwest of their homeland, the LOd1’2 lineage experienced 
episodic splits and showed a broad south-coastal occupation of the 
emerged sub-populations, whereas the ancestors of the LOg lineage were 
wetland sustained AMH homeland (200-130 ka), supported by archaeological
data (represented by the trowel symbol)’ and genetic wildlife data (represented
by the lion, zebra and giraffe symbols)”* >. The out-of-homeland migration
(130-110 ka) results in the split of LOd with LOa’g and LOf divergence. LOd3, LOa
and LOf migrate in a northeast direction, LOd1’2 and LOg migrate southwest,
while LOk remains in the homeland. Insets show BSP analyses of effective
population sizes (EPS) of major LO haplogroups over time, predicting the
maintenance of the homeland LOk population (orange), population growth for
the broadly dispersed southwest LOd1’2 migrants (purple), which is supported
by archaeological evidence (100-60 ka)*° and the StHe mitogenome’, while 
population growth of the northeast LOa migrants coincides with the out-of- 
Africa migration (aqua). Maps were generated in the R package maps v.3.3.0”. 


less successful. Bayesian skyline plot (BSP) (Fig. 2c) analysis confirms 
effective population growth for the LOd1’2 lineage (BSP LOd1’2), whereas 
extensive archaeological evidence indicates cognitively modern human 
behaviour at the southern tip of Africa® ° between approximately 100 
and 60 ka, together with an associated increase in the density of time-appropriate
archaeological sites in coastal compared to inland regions”.
Northeast of their homeland, the LOd3 and LOf lineages are less success- 
ful, whereas the LOa lineage underwent considerable diversification, 
which post-dates the out-of-Africa migration (BSP LOa; Fig. 2c). The 
northeast migration route is further supported by the appearance of 
data-appropriate archaeological sites”®. Within their homeland, the 
population carrying the LOk lineage sustained a constant effective popu- 
lation size (BSP LOk), as did the Kalahari-predominant LOd2b, LOd2a and 
LOd1c lineages. Although the presence of LOk in Zambia has been suggested
to represent contact with an ancient pre-Bantu population”, we
propose that these rare lineages represent an ancient out-of-homeland 
branch of the ancestral KhoeSan population. 

Orbitally driven large-scale hydroclimate variations have been proposed
as a contributor to early human migrations”. In some studies,
wetter conditions and resulting ‘green corridors’ have been proposed 
to explain the out-of-Africa migration (a ‘pull’ scenario), whereas others 
have proposed that drier conditions and resulting food shortages forced 
dispersals (a ‘push’ scenario)°°. To determine whether our predicted 
homeland isolation and major dispersals may have been driven by cli- 
mate shifts, we analysed four key palaeo-hydroclimate datasets” °°, 
along with a transient 784-kyr-long glacial-interglacial simulation conducted
with the LOVECLIM Earth system model’® (Fig. 3). Although
limited by available palaeo-proxy records and a climate model of intermediate
complexity, we observe a considerable degree of coherence on
orbital timescales (Extended Data Fig. 6). During the homeland period 
(200-130 ka), palaeo-data link the 21-kyr-long precession cycle, which 
Fig. 3 | Reconstructed and simulated climatic conditions during the
out-of-homeland migration. a, Austral summer insolation changes (blue) at 27° S.
b, A hydroclimate composite of eastern and central southern Africa (shading)
was obtained by averaging the Fe/K runoff record from core CD154-1006P* and
the Pretoria Salt Pan rainfall reconstruction”’, extended from 250 to 190 ka (grey
line). The plot shows the effective population size for homeland LOk as analysed
by BSP (orange dashed lines). c, Southwestern hydroclimate reconstruction
(shading) obtained by averaging normalized leaf wax data (MD08-3167)* and
the aridity index from cores (MD96-2094)”, for which the aridity record


arises from a combination of Earth’s axis wobble and a slow rotation of
Earth’s entire orbit around the Sun (Fig. 3a), with three wet-dry cycles
(Fig. 3b). By contrast, the climate model simulates an extended drought,
owing to a more pronounced eccentricity signal (Fig. 3e), suggestive of
a wetland oasis in an otherwise vast harsh environment. 
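
As a rough consistency check on the three wet-dry cycles noted above, the length of the homeland period can be divided by the precession period. The sketch below uses only the dates and the cycle length quoted in the text; the variable names are illustrative and the calculation ignores any phase offset between the cycles and the start of the period.

```python
# Back-of-envelope check: how many precession cycles fit in the homeland period?
HOMELAND_START_KA = 200      # start of the homeland period (ka), from the text
HOMELAND_END_KA = 130        # end of the homeland period (ka), from the text
PRECESSION_PERIOD_KYR = 21   # length of one precession cycle (kyr), from the text

duration_kyr = HOMELAND_START_KA - HOMELAND_END_KA
cycles = duration_kyr / PRECESSION_PERIOD_KYR
full_cycles = int(cycles)    # number of complete cycles

print(f"{duration_kyr} kyr / {PRECESSION_PERIOD_KYR} kyr per cycle "
      f"= {cycles:.2f} cycles ({full_cycles} complete)")
```

A 70-kyr window therefore spans a little over three complete 21-kyr cycles, in line with the three wet-dry cycles described.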

During the out-of-homeland period (130-110 ka), our model simula- 
tion supports humid conditions to the northeast that facilitated the first 
dispersals, concurring with LOf coalescence (around 125 ka) (Fig. 3d). 
By contrast, the region southwest of the homeland experienced an 
approximately 15-kyr-long megadrought before an orbital shift created 
the favourable humid conditions that led to the dispersal of the LOd1’2 
lineage (around 113 ka) (Fig. 3f), which is also supported by palaeo-data
(Fig. 3c). This is also around the time the northeast LOa and southwest
LOg migrants last share a common ancestor (around 117 ka). During the
last glacial period (approximately 100-11 ka), we observe a reduction in
the amplitude of the changes in orbital-scale hydroclimate and overall 
drying within the homeland (Fig. 3b), whereas the southwest coastal 
hydroclimate was dominated by precessional variability and showed 
relatively agreeable environmental conditions (Fig. 3c, f). Notably, peri- 
ods of deceleration and acceleration in the estimates of the effective 
population size of the LOd1’2 lineage coincide with regional changes in 
hydroclimate, further linking climate, population size and evolution. 

We propose that the Makgadikgadi-Okavango palaeo-wetland was 
the possible homeland of AMHs. Although one cannot exclude the 
possibility of a polycentric origin™, this deltaic-lacustrine ecosystem 
would have provided an ideal geographical locality for the evolution 
and 70-kyr-long sustained existence of the deepest-branching maternal 
founder population of AMHs. Increased humid conditions, supported by 
extended from 250 to 140 ka (grey line) and the effective population size of
LOd1’2 as analysed by BSP is shown (purple dashed lines). d, Simulated
LOVECLIM normalized precipitation changes (shading) northeast of the 
homeland (33° E, 13° S) and coalescence time probabilities for LOf haplogroup 
(blue bell curves). e, Same as for d, but for the homeland and coalescence 
probabilities for LO, LOd’k, LOa’b’f’g (black) and LOk haplogroups (orange). 

f, Same as for d, but for the area southwest of the homeland (17° E, 30° S) and 
LOd1’2 coalescence times (purple). Blue bars indicate predicted Makgadikgadi 
high stand phases*. NE, northeast; SW, southwest. 


palaeo-lake system reconstructions”, between 130 and 110 ka would have
opened green corridors for successful northeast-southwest migrations, 
supporting a pull scenario. Drying within the homeland following the 
out-of-homeland period, supported by hydroclimate data (110-100 ka) 
and a model simulation (100-80 ka), would have created a push scenario,
in which a reduced carrying capacity of the land would have increased
pressure to seek out climatically more favourable regions. We propose 
that the southwest migrants maintained a successful coastal forager 
existence, while the northeast migrants—similar to the later-branching 
population of L1’6—gave rise to ancestral pastoral and farming popula- 
tions. A recent publication® provides further mitochondrial evidence 
to support the northeast out-of-homeland migration route and expan- 
sion into eastern Africa around 70-60 ka. Revealing a southern African 
homeland for the emergence and extended subsistence of the LO lineage,
we propose that an out-of-homeland migration event, which was prob- 
ably driven by astronomically induced regional shifts in hydroclimate, 
shaped the present-day ethnic and genetic diversity of modern humans. 
Methods


No statistical methods were used to predetermine sample size. The 
experiments were not randomized and investigators were not blinded 
to allocation during experiments and outcome assessment. 


Statement on population identifiers 

The authors acknowledge that population identifiers (or ethnic labels) 
have different meanings to different peoples across different countries 
and between and within different ethnic groups. During the apartheid 
rule, South Africans were grouped according to ethnic identities, which 
resulted in discrimination based on population identifiers such as Bantu 
or Coloured. In turn, others view the very same population identifiers
with cultural identity and pride. In 2013, we performed a study led by 
a Coloured co-author to assess the sensitivity in self-identification as 
Coloured. Of 521 participants, 91.2% self-identified as Coloured, Cape 
Coloured or South African Coloured, while 8.8% elected against the 
use of Coloured for self-identification™. In turn, using such population 
identifiers within the context of the United States would be seen as 
derogatory and highly offensive. We have previously genetically profiled
the Baster population of Namibia”. Again, although the term may be
derogatory to others, it is used with immense pride by the Baster
community of Rehoboth in Namibia, who recognize themselves as a
Republic with a national flag’’.

In this study, the authors have used linguistics, supported by eth- 
nicity, to provide population identification, with further historical, 
geographical and genetic classification for deriving maternal contributions
(described in the next section). KhoeSan (or KhoeSaan) languages
are grouped together due to their use of click consonants as a unique 
language identifier. Once spread across the entire southern African 
region, KhoeSan languages are today restricted largely to populations 
residing in Namibia and Botswana (and southern Angola), although two 
Tanzanian isolates, Sandawe and Hadza, are believed to be linguistically 
related click languages (or east African KhoeSan)”. ‘San’ literally means 
‘forager’ and Khoe means ‘person’; culturally, the KhoiSan identifier 
refers to hunter-foragers (San) or herders (Khoi). At times linguistic 
and cultural identities clash. For example, Nama and Hai||om peoples
both speak Nama (a Khoe-Kwadi language), while culturally and histori- 
cally these two populations are quite different, representing a herder 
and hunter-gatherer ancestry, respectively. Additionally, autosomal 
genetic data have been used to provide further insights into KhoeSan 
admixture and substructures, highlighting at a genetic level the historical
differences between the Nama and Hai||om*°. We have attempted to
capture both ethnic and linguistic identifiers that best reflect population
ancestry. In contrast to KhoeSan languages, most Bantu languages do 
not contain click consonants; however, exceptions exist within Southern
African Bantu languages (for example, isiXhosa and isiZulu languages, 
which have borrowed click consonants from their KhoeSan neighbours). 
Bantu languages are spoken across sub-Saharan Africa (up to 500 groups);
the Guthrie classification of languages further identifies the S-zone or
Southern Bantu (South Africa, Zimbabwe, southern Mozambique and most
of Botswana) and the R-zone or Southwest Bantu languages (northern 
Namibia, southern Angola and northwest Botswana)", which are of 
relevance to this study. 


Ethics statement and recruitment 

The study was performed in accordance with the ethical standards of 
the overseeing human research ethics committees and local govern- 
ance, as per the 1964 Helsinki Declaration. The study was reviewed 
and approved by the Ministry of Health and Social Services (MOHSS) 
in Namibia (17-3-3 2008, 2014 and 2019), with additional local approv- 
als from participating community leaders, the University of Preto- 
ria Human Research Ethics Committee (HREC 43/2010 and HREC 
280/2017), including US Federal-wide assurance (FWA00002567 and 
IRB00002235 IORG0001762), as well as the South African National


Blood Service (SANBS) HREC (HREC 2012/11). Participants were 
recruited within the borders of Namibia and South Africa and self- 
reported ethno-linguistic population identifiers were recorded. Blood 
samples were taken after receiving written and/or recorded informed 
consent. Isolated DNA was shipped under the Republic of South Africa 
Department of Health Export Permit (J1/2/4/2), in accordance with the
National Health Act 2003, to the Garvan Institute of Medical Research 
in Australia. Mitogenome sequencing was performed in accordance 
with site-specific approval granted by St Vincent’s Hospital HREC in 
Australia (SVH 15/227). 


Participant population identifiers 

Merging with published data for a total of 1,217 LO mitogenomes, par- 
ticipants were broadly classified as KhoeSan, Bantu or Cape multi- 
ethnic heritage. Indigenous KhoeSan who inhabit the inland semi-desert 
Kalahari region of Botswana and Namibia include the Kx’a (Ju|’hoan 
or Hoan, and !Xun or !Xuun), Tuu (or Taa) and Khoe-Kwadi (Naro, 
||Ani, Khwe, Buga, G||ana, G|ui, ||Xokhoe, Tshwa and Shua) speakers.
Indigenous KhoeSan who inhabit the west-coastal region of Namibia 
speak a Khoe-Kwadi or Nama language and include the Nama, Damara, 
Topnaar (ǂAonin) and Hai||om speakers*”**. Novel mitogenomes were
derived from 15 Kalahari KhoeSan, including Ju|’hoan (n=9), !Xun (n=1)
and Naro (n=5), and 21 west-coastal KhoeSan, including Nama (n=7),
Damara (n=8) and Topnaar (ǂAonin, n=6) from Namibia. Speakers
of southwest Bantu (non-KhoeSan) languages (which do not contain 
click consonants) of Namibia, Botswana and southern borders of
Angola, presenting with KhoeSan-predominant LO maternal lineages, 
most likely carry a Kalahari or west-coastal KhoeSan mitogenome. As a
result of refuge provided to the Herero by the Kalahari KhoeSan during
the early-1900s German South West African genocide“, we speculate
a probable Kalahari KhoeSan heritage for the three Herero
mitogenomes in this study.

Although indigenous KhoeSan are arguably absent from the coastal
regions of South Africa, and while recognising and honouring the northwest
inland (southern Kalahari) ǂKhomani San of South Africa (although
not recruited within the context of this study), KhoeSan skeletal remains
are spread across the region*. Hunter-gatherer KhoeSan once inhabited a
broad southwest to east-coastal region at the tip of Africa. These skel- 
etal remains predate archaeological evidence supporting the arrival 
of sheep herders who appear to have crossed the Okavango river in 
northern Namibia around 2.2 ka, migrating along the southwest coast 
to the southern Cape”?°?°* by around 2 ka. Recently, Cape KhoeSan 
skeletons younger than 2 ka have been genetically linked to east Africa 
and herder migration”. Migrating herders may have acquired indigenous
KhoeSan maternal contributions. Along the east coast, southward-migrating
Bantu farmers (Southern Bantu; who presumably did not
speak languages containing click consonants) entered South Africa 
around 1,500 years ago, while a second wave of Bantu migrants (South- 
west Bantu) crossed central Africa into Namibia around 800 years ago”. 
Maternal contributions to the South African Southern-Bantu-speaking 
populations (n= 43, this study) may therefore either be of Bantu origin 
(in this case, LOa lineages and therefore non-KhoeSan) or of east-coastal
KhoeSan-ancestry. The arrival of European colonists and Dutch-East- 
Indian slaves to the Cape in the mid-1600s, gave rise to a multi-ethnic 
(European, Asian, KhoeSan and Bantu) Cape population, the ancestors 
of the South African Coloured (n= 90, this study) and Namibian Basters 
(n= 24, this study), who historically speak a Dutch-derived language 
known as Afrikaans**. Emerging from a common historical background
to the Coloured, the Baster population have since the late 1800s distin- 
guished themselves as independent from the Coloured, migrating to 
the Baster nation of Rehoboth in Namibia’. Although the vast majority 
of LO mitogenomes represented in the Baster and Coloured popula- 
tions are of Cape KhoeSan heritage (100% and 94.4%, respectively), we 
observe a percentage of non-KhoeSan (Bantu) LOa lineages within the 
Coloured population. 


LO haplogroup pre-screening 

Subjects were selected for whole-mitogenome sequencing based on 
pre-screening for specific LO markers using direct amplicon-specific 
Sanger sequencing. Specifically, a 2,673-bp region (Cambridge Refer- 
ence Sequence (rCRS) position 3322-5995) was amplified and initially 
screened for the LO variant T5442C. LO samples were further screened to 
delineate LOd (T4232C), LOd1 (G3438A), LOd1b (T3618C), LOd1c (C4197T),
LOd1’2 (A3756G), LOd2 (A3981G, C205T, A4044G), LOd2a (A5153G), LOd2d
(G5147A, G5231A), LOd2c (A4038G, T4937C) and LOd3 (G5460A, G5773A)
lineages. This identified 188 samples carrying a rare LO haplogroup: 
LOd1b (n=21), LOd1c (n=13), LOd2a (n=30), LOd2b (n=7), LOd2c (n=15),
LOd2d (n=6), LOd3 (n=29), LOa1 (n=6), LOa2 (n=6), LOf (n=5) and LOk
(n=5); as well as 55 samples that could not be unambiguously assigned
to a major LO sub-lineage: LOd1a’c (n=2), LOa’b’f’k (n=5), LOa’b (n=2),
LOd2 (n=1) and LOd1 (n=45, assumed LOd1a) (Supplementary Table 1).
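
The hierarchical logic of this pre-screen can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses only a subset of the screening variants listed above (those for the LOd1 and LOd3 branches), with lineage labels copied as printed in the text. The parent-child table, the rule that a lineage is called only when its own markers and those of all its ancestors are observed, and the function names are assumptions made for the example.

```python
# Hypothetical sketch of marker-based pre-screening.
# Variants and labels are copied from the text; the tree below and the
# "all ancestral markers must be present" rule are illustrative assumptions.
SCREENING_MARKERS = {
    "LO": {"T5442C"},
    "LOd": {"T4232C"},
    "LOd1'2": {"A3756G"},
    "LOd1": {"G3438A"},
    "LOd1b": {"T3618C"},
    "LOd1c": {"C4197T"},
    "LOd3": {"G5460A", "G5773A"},
}

# Parent of each lineage (None marks the root of the screened tree).
PARENT = {
    "LO": None,
    "LOd": "LO",
    "LOd1'2": "LOd",
    "LOd1": "LOd1'2",
    "LOd1b": "LOd1",
    "LOd1c": "LOd1",
    "LOd3": "LOd",
}


def lineage_path(lineage):
    """Return the list of lineages from the root down to `lineage`."""
    path = []
    while lineage is not None:
        path.append(lineage)
        lineage = PARENT[lineage]
    return path[::-1]


def assign_lineage(observed_variants):
    """Return the deepest lineage whose full marker path is observed.

    Returns None if not even the root marker is present. If two lineages
    of equal depth both match, the first one encountered is returned.
    """
    observed = set(observed_variants)
    best, best_depth = None, 0
    for lineage in SCREENING_MARKERS:
        path = lineage_path(lineage)
        supported = all(SCREENING_MARKERS[node] <= observed for node in path)
        if supported and len(path) > best_depth:
            best, best_depth = lineage, len(path)
    return best


sample = {"T5442C", "T4232C", "A3756G", "G3438A", "T3618C"}
print(assign_lineage(sample))
```

A sample carrying only some of the markers of a branch is assigned to the deepest fully supported ancestor, which mirrors how partially resolved samples in the pre-screen were left at a broader sub-lineage.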


Whole-mitogenome sequencing 

Mitogenomes were isolated using two overlapping amplicons as previ- 
ously described°*“*. Specifically, two primer pairs were used to isolate 
and amplify fragments 12,250-3,005 (7.2 kb) and 2,583-12,337 (9.7 kb) 
of the circular mitogenome. This pair of primers has been demonstrated 
to effectively capture the mitogenome with high specificity while mini- 
mizing off-target capture of nuclear copies of mitochondrial-derived 
DNA. Following touchdown long-range amplification with the Platinum 
Taq DNA Polymerase High Fidelity (Invitrogen), the two amplicons were 
purified using AMPure XP beads (Agencourt) and combined in a 7:13 
ratio of short:long fragments. Sequencing was performed on the Ion 
Torrent PGM platform. In brief, 200-bp single-end sequencing libraries 
were prepared using the Ion Xpress Plus Fragment Kit and Ion Xpress
Barcode Adaptors (ThermoFisher), and 4-16 samples (barcodes) were 
pooled and sequenced on 314v2 Ion Chips. Using the Ion Torrent suite 
v.5.0.2.1, sequencing reads were quality trimmed and aligned to the 
human mitochondrial revised rCRS (accession NC_012920.1). Consensus 
mitogenome sequences were derived by first identifying variants relative 
to rCRS, using samtools (v.1.3.1) mpileup (with parameters -d 10000 -L
1000 -Q 7 -h 50 -o 10 -e 17 -m 4)” and bcftools (v.1.3.1) call (with parameters
-c -M) (http://www.htslib.org/doc/bcftools.html), then converting
to the FASTA format using the vcfutils.pl vcf2fq program in samtools. 
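
To illustrate how the two amplicons tile the circular mitogenome, the sketch below computes their lengths and overlaps from the fragment coordinates quoted above. It assumes that the coordinates are 1-based and inclusive and that the reference length is 16,569 bp (the length of rCRS, NC_012920.1); the function names are illustrative.

```python
# Sketch: check that two overlapping amplicons cover a circular genome.
RCRS_LENGTH = 16569  # length of rCRS (NC_012920.1) in bp


def circular_length(start, end, genome_length=RCRS_LENGTH):
    """Length of a fragment running start -> end (1-based, inclusive) on a circle."""
    if end >= start:
        return end - start + 1
    # The fragment wraps past the last position back to position 1.
    return genome_length - start + 1 + end


def covered_positions(start, end, genome_length=RCRS_LENGTH):
    """Set of positions covered by a fragment on the circular genome."""
    if end >= start:
        return set(range(start, end + 1))
    return set(range(start, genome_length + 1)) | set(range(1, end + 1))


# Fragment coordinates as given in the text.
AMPLICONS = {"short": (12250, 3005), "long": (2583, 12337)}

lengths = {name: circular_length(*coords) for name, coords in AMPLICONS.items()}
short_set = covered_positions(*AMPLICONS["short"])
long_set = covered_positions(*AMPLICONS["long"])
uncovered_bp = RCRS_LENGTH - len(short_set | long_set)
overlap_bp = len(short_set & long_set)

for name, length in lengths.items():
    print(f"{name} amplicon: {length} bp")
print(f"uncovered positions: {uncovered_bp}, total overlap: {overlap_bp} bp")
```

Under these assumptions the two fragments are close to the nominal 7.2-kb and 9.7-kb sizes, leave no position uncovered, and overlap at both junctions.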


Publicly available data 

An exhaustive search for publicly available LO mitogenomes was per- 
formed between 2015 and 2017, identifying 26 studies comprising a total 
of 6,334 mitogenomes. LO status for all mitogenomes was deduced, either 
directly from the original publication or by downloading the nucleotide 
sequences from NCBI and evaluating their haplogroup using HaploGrep2
(v.2.1.13)°° based on PhyloTree Build 17°. From this dataset, a subset of 
1,019 LO mitogenomes was identified and included in this study (Extended
Data Table 1 and Supplementary Table 2). Publicly available genomes were
broadly classified as KhoeSan, Bantu (KhoeSan ancestral), or non-KhoeSan
based on the reported population and/or country of origin. 


Whole-mitogenome haplotyping 

HaploGrep2” was used to type all 1,217 sequences against PhyloTree 
Build 17". This resulted in the refinement and reclassification of our 198 
mitogenomes, resulting in LOd1 (n = 81, including 45 LOd1a, 21 LOd1b, 
13 LOd1c and 2 LOd1d), LOd2 (n=58, including 30 LOd2a, 8 LOd2b, 14
LOd2c and 6 LOd2d), LOd3 (n=27), LOa (n=19), LOf (n=5), LOk (n=5) 
and LOg (n =3) mitogenomes (Supplementary Table 1). This refined, 
and in some cases reclassified, the haplogroups of the 1,019 publicly 
available mitogenomes (Supplementary Table 2). 
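
The haplogroup counts above can be reconciled with the dataset totals in a few lines. The sketch below uses only numbers stated in the text (lineage labels are copied as printed); the dictionary layout and helper name are illustrative.

```python
# Sketch: reconcile per-lineage counts of the novel mitogenomes with the totals.
NOVEL_COUNTS = {
    "LOd1": {"LOd1a": 45, "LOd1b": 21, "LOd1c": 13, "LOd1d": 2},
    "LOd2": {"LOd2a": 30, "LOd2b": 8, "LOd2c": 14, "LOd2d": 6},
    "LOd3": 27,
    "LOa": 19,
    "LOf": 5,
    "LOk": 5,
    "LOg": 3,
}
PUBLIC_TOTAL = 1019  # publicly available mitogenomes included in the study


def total(value):
    """Sum sub-lineage counts if given a dict, otherwise return the count."""
    return sum(value.values()) if isinstance(value, dict) else value


lineage_totals = {name: total(value) for name, value in NOVEL_COUNTS.items()}
novel_total = sum(lineage_totals.values())
combined_total = novel_total + PUBLIC_TOTAL

print(lineage_totals)
print(f"novel: {novel_total}, novel + public: {combined_total}")
```

The sub-lineage counts add up to the stated LOd1 and LOd2 totals, and the novel and public sets together give the 1,217 mitogenomes used throughout.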


Phylogenetic inference 
Multiple sequence alignment was performed across all 1,217 mitog- 
enomes along with 7 Neanderthal genomes (Supplementary Table 10), 


using MUSCLE v.3.8.31” with parameters -maxiters 3 -diags1. Phylo- 
genetic inference was performed using FastTree v.2.1.7 (SSE3)? using 
the generalized time reversible (-gtr) and discrete gamma model with 
20 rate categories (-gamma). A summary of the inferred phylogenetic
tree is shown in Extended Data Fig. 1, with the tree rerooted to the 7 
Neanderthal genomes. 

Bayesian phylogenetic inferences and divergence times were calculated
using BEAST2 v.2.4.2 with BEAGLE v.2.0*. Owing to the computational
burden of this analysis, BEAST was performed on a subset of
461 mitogenomes, selected to include: (i) only complete mitogenomes 
(27 mitogenomes with only the coding region®”* were excluded); (ii) all 
198 novel mitogenomes from this study; (iii) all 121 LO mitogenomes 
from our previous studies, Chan et al.° (n = 77), Morris et al.’ (StHe, 
defining the new haplogroup LOd2c1c), McCrow et al.*8 (n = 37) and
Schuster et al.*” (n= 6); (iv) all rare haplogroups, namely LOg (n= 9), LOf 
(n=22), L0d3 (n=30), LOd1d (n=3), LOd2d (n=11) and LOk2 (n=12); (v) all
mitogenomes that could not be unambiguously typed by HaploGrep2 
(n=14; none from this study); and (vi) a random subset of mitogenomes
for all remaining sub-lineages not already represented. 

Multiple sequence alignment of the subset of 461 AMH and 7 Neanderthal
mitogenomes was converted to NEXUS format using the convert
function of seqmagick v.0.6.1 (https://fhcrc.github.io/seqmagick)
with parameter --alphabet dna-ambiguous. This provided the input 
to BEAST2. Specifically, BEAUti v.2.4.2 was used to set up the phylogenetic
model, assuming: (i) the gamma site model with six gamma
categories and no invariant sites; (ii) the generalized time-reversible 
substitution model; (iii) a strict constant clock model with a normal prior
with μ = 1.665 × 10⁻⁸ and σ = 1.479 × 10⁻⁹ based on a previously published
study*; and (iv) a coalescent constant population. Times were calibrated
to the seven H. neanderthalensis mitogenomes with tip dates set to their
reported approximate archaeological dating estimates: Feldhofer 1, 
40 ka®; Vindija, 38 ka®; El Sidron, 39 ka’; Feldhofer 2, 40 ka°’; Mezmais- 
kaya, 65 ka®’; Croatia, 38.31 ka; Altai, 50 ka“! (Supplementary Table 10). 
No prior was set on the most recent common ancestor of this taxon set,
and calibration was applied to the leaves instead of the most recent 
common ancestor. Further, a normal prior, N(μ = 200,000, σ = 50,000),
was set on the coalescent time of the AMH genomes, and a tip date of
2,330 years before present was set for the StHe genome’. 
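
To give a sense of how wide the prior on the AMH coalescent time is, the sketch below computes its central 95% range and compares it with the 95% confidence interval reported in the main text for the emergence of the LO lineage (240-165 ka). It assumes the prior is expressed in years with a mean of 200,000 and a standard deviation of 50,000, as stated above; everything else is illustrative.

```python
# Sketch: central 95% range of the normal prior on the AMH coalescent time.
from statistics import NormalDist

prior = NormalDist(mu=200_000, sigma=50_000)  # years before present
lower = prior.inv_cdf(0.025)
upper = prior.inv_cdf(0.975)

# 95% interval for the LO emergence reported in the main text (years).
REPORTED_LOWER, REPORTED_UPPER = 165_000, 240_000
inside_prior_range = lower <= REPORTED_LOWER and REPORTED_UPPER <= upper

print(f"prior 95% range: {lower:,.0f} to {upper:,.0f} years")
print(f"reported interval inside prior range: {inside_prior_range}")
```

The prior spans roughly 100,000 to 300,000 years, so the reported interval is considerably narrower than the prior range that contains it.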

Five BEAST replicates were performed, each with 100 million Markov 
chain Monte Carlo iterations, sampling every 10,000. Tracer v.1.6 was 
used to evaluate BEAST trace files (Supplementary Table 11), ensuring all 
runs had converged. The five replicates were combined using LogCom- 
biner v.2.4.2, discarding 10% of the samples as burn-in for each replicate 
and without resampling states at a lower frequency. 

Sampled trees from BEAST were summarized into a single maximum 
clade credibility target tree using TreeAnnotator v.2.4.2 for each of the 
five replicates, discarding the first 10% as burn-in. To summarize across 
replicates, sampled trees from the five replicates were first combined 
using LogCombiner v.2.4.2, again discarding the first 10% as burn-in 
from each replicate, but resampling at a lower frequency of 50,000 
(five replicates of 10,000 samples). The combined, resampled trees 
were then summarized with TreeAnnotator v.2.4.2 as for the individual 
replicate BEAST results. 
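
The approximate number of posterior samples implied by these settings can be worked out as below. This is only an arithmetic sketch: it ignores the initial state logged at iteration 0, and the exact counts retained by LogCombiner may differ slightly depending on how burn-in and resampling are rounded.

```python
# Sketch: approximate posterior sample counts implied by the MCMC settings.
CHAIN_LENGTH = 100_000_000   # iterations per replicate
SAMPLE_EVERY = 10_000        # logging interval
N_REPLICATES = 5
BURN_IN_PERCENT = 10         # discarded from each replicate
RESAMPLE_EVERY = 50_000      # lower frequency used when combining trees

samples_per_replicate = CHAIN_LENGTH // SAMPLE_EVERY
burn_in_samples = samples_per_replicate * BURN_IN_PERCENT // 100
kept_per_replicate = samples_per_replicate - burn_in_samples

# Parameter logs were combined without resampling.
combined_log_samples = kept_per_replicate * N_REPLICATES

# Trees were thinned to one sample per RESAMPLE_EVERY iterations.
thinning_factor = RESAMPLE_EVERY // SAMPLE_EVERY
combined_tree_samples = (kept_per_replicate // thinning_factor) * N_REPLICATES

print(f"samples per replicate: {samples_per_replicate}")
print(f"kept after burn-in:    {kept_per_replicate}")
print(f"combined log samples:  {combined_log_samples}")
print(f"combined tree samples: {combined_tree_samples}")
```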

FigTree v.1.4.2+ (http://tree.bio.ed.ac.uk/software/figtree/) was used 
to visualize all resulting trees. 


BSP analysis 
BSP analyses were performed to estimate the demographic history 
of each maternal haplogroup. Although maternal haplogroups do 
not necessarily equate to population groups, it has been suggested 
that the signal associated with a haplogroup can still provide insights 
into the demographic processes in the populations who carry the 
haplogroup®. 

For each haplogroup of interest (for example, LOa, LOd1’2 and LOk), 
a NEXUS file was derived using seqmagick v.0.6.1 as described above.
BSP analyses were performed using BEAST2, using BEAUti 2 for model
setup as described in ‘Phylogenetic inference’, with the following key 
differences: (i) the gamma shape of the gamma site model was esti- 
mated with an exponential prior with mean = 1.0 and offset = 0.0; 
(ii) the molecular clock was fixed (not estimated) at 1.665 × 10⁻⁸ based
on a previously published study*®; and (iii) the phylogenetic tree prior
was set to coalescent Bayesian skyline, assuming 20 intervals between 
the root of the tree and the present time. 

Tracer v.1.6 was used to reconstruct the Bayesian skyline from the 
sampled trees for each analysis, using a stepwise constant variant and the 
lower 95% highest posterior density of the root height as the maximum 
time. Results of this analysis are summarized in Supplementary Table 12. 
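
To give a feel for what a stepwise-constant skyline represents, the sketch below computes the classic (non-Bayesian) skyline estimate from a list of coalescent times. This is a simplified analogue and not the BEAST2/Tracer procedure used here: the Bayesian skyline groups intervals (20 in this study) and samples population sizes by MCMC. The function name `classic_skyline` and the toy input are hypothetical.

```python
def classic_skyline(coalescent_times):
    """Classic skyline estimate from one genealogy sampled at time 0.

    coalescent_times : increasing times (before present) of the coalescent
                       events of a tree with len(coalescent_times) + 1 tips.
    Returns a list of (start, end, estimate) tuples, where `estimate` is the
    stepwise-constant value of Ne * generation time within that interval,
    given by interval_length * k * (k - 1) / 2 for k coexisting lineages.
    """
    k = len(coalescent_times) + 1   # lineages present at time 0
    previous = 0.0
    steps = []
    for t in coalescent_times:
        duration = t - previous
        steps.append((previous, t, duration * k * (k - 1) / 2))
        previous = t
        k -= 1                      # each coalescence merges two lineages
    return steps


# Toy genealogy with three tips and coalescences at times 1.0 and 3.0.
for start, end, estimate in classic_skyline([1.0, 3.0]):
    print(start, end, estimate)
```

Long intervals with many lineages imply a large population at that time; the Bayesian version smooths this noisy per-interval estimate by pooling intervals and integrating over trees.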


Geographical history of the palaeo-wetland Makgadikgadi 
Initiated around 2 million years ago, palaeo-lake Makgadikgadi’ originally 
covered an area of around 170,000 km² at its highest lake stand, 
bounded by a shoreline of around 995 m. A degraded sand ridge (the 
Deception ridge) was associated with the 995-m shore in the southwest 
of the lake. This lake would have covered more than twice the area of 
modern Lake Victoria and, similar to the latter, would have caused a 
considerable climatic feedback, with locally enhanced rainfall. We previously 
proposed that this was, in turn, responsible for the initiation of the 
surrounding (now-fossil) drainages, creating a well-watered environment 
and very favourable habitat for mammals, including hominids’. Smaller 
lakes, now represented by residual wetlands, also formed on the upper 
Zambezi river and the modern Kafue Flats on the Kafue river, resulting 
in an archipelago of palaeo-lakes in south-central Africa during the Early 
and Middle Pleistocene epoch. 

Palaeo-Makgadikgadi, bounded by the 995-m shoreline, was originally 
sustained by a major drainage line, which included the Chambeshi river 
as headwaters, connected to the upper Zambezi river via the upper Kafue 
river. Severance of the original links between the Chambeshi river and 
upper Kafue river, and the latter and the upper Zambezi river resulted 
in a sequential contraction of the Makgadikgadi to a much smaller water 
body. This is reflected in a series of fossil shorelines, associated with 
breaks in slope, at progressively lower levels (945 m, 936 m and 922 m). 
The Gidikwe ridge was associated with the 945-m shoreline. However, 
contraction of the lake was accompanied by the development of the 
modern Okavango delta. Timing of the contraction of the lake and 
initiation of the Okavango delta is not tightly constrained, but by the 
time that we propose that modern humans emerge within the region, 
around 200 ka, we speculate that the formerly extensive Makgadikgadi 
palaeo-lake had contracted to a much less extensive deltaic-lacustrine 
system. Together with the lakes that developed from the upper Zambezi 
and Kafue rivers to the north and the Okavango delta to the west, the 
region would have been a vast wetland, a favourable habitat for homi- 
nid occupation. It is this palaeo-wetland region that we propose as the 
homeland for the founder population of AMHs. 


Climate model simulations and palaeo-climate data 

To place the coalescence time estimates of the L0 branch into a 
climatic context and to test the robustness of simulated hydroclimate 
responses in South Africa to orbital-scale conditions, we use the 
LOVECLIM Earth system model of intermediate complexity”. It is based on 
a 3-layer atmosphere, a 20-level ocean general circulation model, a 
dynamic-thermodynamic sea-ice model and a terrestrial vegetation 
model. A transient simulation that covers several glacial–interglacial 
cycles was conducted using time-dependent boundary conditions. 
The experiment’® (covering the past 784 kyr) uses time-varying 
boundary conditions for orbital parameters, CO₂ and other greenhouse gas 
concentrations obtained from Antarctic ice cores, and an estimate of 
Northern Hemispheric ice-sheet orography and albedo changes (data 
are used in Fig. 3 and Extended Data Fig. 6). The forcings are applied with 
an acceleration factor of five: one coupled model year corresponds to 
five orbital calendar years. Our analysis focuses on the past 250 kyr in 
both simulations. The climate sensitivity of this model to CO₂ variations 
was modified to capture the range of reconstructed global mean surface 
temperature changes in response to radiative forcing™. The transient 
LOVECLIM model simulations have previously been validated against 
other palaeo-climate records from around the world”****. Our analysis 
here focuses on the simulated precipitation as well as changes in tree 
and grass fractions in central eastern Africa and western southern Africa 
(data used in Fig. 3d–f and Extended Data Fig. 6b–d). 

As a result of its coarse horizontal atmospheric resolution (5.6°) and 
the use of only parameterized ageostrophic wind components, LOVECLIM 
has several deficiencies. Of particular note are the lack of realistic 
El Niño–Southern Oscillation variability and the fact that annual mean 
freshwater flux corrections have been applied to mimic the atmospheric 
moisture transport from the Atlantic to the Pacific and to stabilize the 
Atlantic Meridional Overturning Circulation. 

There exist only a few long-term hydroclimate datasets from southern 
Africa that cover the past >120 kyr. Here we compare the simulated 
LOVECLIM precipitation (normalized) in central southern Africa with a 
southern central African hydroclimate composite, obtained by averaging the 
normalized orbitally tuned rainfall reconstruction from the Pretoria salt 
pan”’ and the normalized Fe/K river run-off proxy obtained from marine 
sediment core CD154-1006P* (Fig. 3b). The composite index emphasizes 
the joint variability in both records. We find that some of the overall 
features in the observations, particularly the fact that rainfall is modulated 
by the precessional cycle of austral summer insolation® (Fig. 3a), are 
well captured by the LOVECLIM model experiment. However, we also find 
some discrepancies in the central part of southern Africa, such as in the 
phase of the precessional signal and the difference in overall wet and dry 
conditions during the homeland period from 200 to 120 ka. The overall 
glacial drying in the central part of southern Africa from 100 to 20 ka is, 
however, captured in both model simulation and palaeo-proxy 
reconstructions (Fig. 3b, e). Orbital-scale hydroclimate variations in southern 
Africa are clearly not spatially homogeneous (Fig. 3b–f). To gain a better 
understanding of the spatial patterns of hydroclimate variability, we 
compared the model simulation with a composite index from southwestern 
Africa, obtained by averaging a normalized aridity index reconstructed 
from sediment core MD96-2094* and the normalized δ¹³C isotope ratio 
data of leaf wax extracted from the South Atlantic sediment core 
MD08-3167” (Fig. 3c and Extended Data Fig. 6c, d). The results show a good 
correspondence between model and reconstructions on the western side 
of southern Africa, and in particular reproduce a major drought period 
that peaked around 120 ka and a subsequent increase in rainfall towards 
the last glacial period. This gradual increase in rainfall corresponds to 
an overall increase in lineage splitting of the L0d1’2 haplogroup (Fig. 3f) 
and growth of its population (Fig. 3c). This result further highlights the 
possibility that climate shifts may have played an important part in the 
southwestward migration of L0d1’2 descendants (Fig. 2). 
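
The composite indices described above are averages of normalized proxy records. A minimal Python sketch of that construction follows, assuming that both records have already been interpolated onto a common time axis and that "normalized" means a z-score (zero mean, unit standard deviation); the source does not spell out either detail, and the function names are hypothetical.

```python
import numpy as np


def normalize(series):
    """Z-score a proxy record: subtract its mean, divide by its standard deviation."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) / series.std()


def composite_index(record_a, record_b):
    """Average two normalized records sampled on the same (assumed) time axis."""
    return 0.5 * (normalize(record_a) + normalize(record_b))


# Toy records: identical shape in different units, then perfectly opposed shapes.
print(composite_index([1, 2, 3], [10, 20, 30]))
print(composite_index([1, 2, 3], [3, 2, 1]))
```

Because each record is rescaled before averaging, variability that the two proxies share is retained while variability that only one of them shows is damped, which is the sense in which the composite emphasizes joint variability.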

To further test the fidelity of LOVECLIM in reproducing 
interhemispheric orbital rainfall shifts across Africa, we also compared the 
simulated vegetation changes with a leaf-wax index from stable hydrogen 
isotope data extracted from a sediment core in the Gulf of Aden”, which 
is indicative of hydroclimate and vegetation changes in the 
northeastern Horn of Africa (Extended Data Fig. 6b). The comparison shows a 
good qualitative correspondence for the precessional-scale timing of 
rainfall and vegetation maxima and minima as well as of the 
eccentricity-modulated amplitude of these changes, lending further support to 
the credibility of the simulated rainfall patterns across Africa. It should 
be noted that regional patterns of palaeo-rainfall changes are in general 
difficult to simulate. In response to Last Glacial Maximum boundary 
conditions, different coupled general circulation models simulate widely 
varying responses in rainfall over Africa’®. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The consensus sequences for this set of 198 mitogenomes have been 
deposited in the NCBI GenBank with accession numbers 
MK248274–MK248471. Requests for materials should in the first instance be 
addressed to V.M.H. 
Extended Data Fig. 1 | Phylogenetic tree of all 1,217 L0 mitogenomes. 
Phylogeny was inferred using FastTree v.2.1.7⁴⁶, displayed using FigTree. Tips 
belonging to the same haplogroup are collapsed and coloured as in Fig. 2a. Local 
support values for each node are indicated and branch lengths are proportional 
to the number of substitutions per site. The tree is rooted to the seven 
Neanderthal mitogenomes as indicated. 
Extended Data Fig. 2 | Detailed phylogenetic branching of L0k, L0d3, L0f 
and L0g. a-d, Expanded sections of the phylogenetic tree depicted in Fig. 2a are 
shown, including 34 (out of a total of 113) L0k (a), all 40 L0d3 (b), all 27 L0f (c) and 
all 9 L0g (d) mitogenomes. Each mitogenome is represented as a tip and 
coloured based on their broad ethno-linguistic classification, if known. KhoeSan 
is shown in orange, non-KhoeSan in grey and Cape multi-ethnic (KhoeSan 
ancestral) in green. Publicly available mitogenomes for which we cannot be 
certain of their broad population identifier are labelled in black font. Proposed 
new sub-lineages for L0d3, L0f and L0g1 are indicated by red-coloured node 
labels and are further described in Supplementary Tables 7-9. 
Extended Data Fig. 3 | Detailed phylogenetic branching of L0d2. 
a, c, d, Expanded branches of the phylogenetic tree depicted in Fig. 2a are 
shown, including 51 (out of a total of 118) L0d2a (a), 25 (out of 53) L0d2c (c) and all 
11 L0d2d (d) mitogenomes. b, For L0d2b, an additional BEAST analysis was 
performed using an alternate subset of 441 mitogenomes that included all 
43 L0d2b samples, as opposed to the n = 461 subset (Fig. 2a) that included 
only 13 L0d2b. The same model parameters were used for both data subsets. In 
all panels, each mitogenome is represented as a tip and coloured based on their 
broad ethno-linguistic classification, as in Extended Data Fig. 2. The previously 
defined L0d2c1c haplogroup, containing the coastal KhoeSan StHe skeleton® 
and other newly proposed sub-lineages are indicated by red node labels 
(Supplementary Tables 4-6). 
Extended Data Fig. 4 | Detailed phylogenetic branching of L0d1. 
a-c, Expanded branches of the phylogenetic tree depicted in Fig. 2a are shown, 
including 54 (out of a total of 91) L0d1a (a), 45 (out of 174) L0d1b (b) and 33 (out of 
Data Fig. 2. 
Extended Data Fig. 5 | Detailed phylogenetic branching of L0a. The L0a 
branch of the phylogenetic tree displayed in Fig. 2a is shown, which includes a 
subset of 114 (out of a total of 294) L0a mitogenomes. Each mitogenome is 
represented as tips and coloured based on their broad ethno-linguistic 
classification as in Extended Data Fig. 2. 
Extended Data Fig. 6 | Comparison of the palaeo-data and palaeo-model. 
a, Locations of key sites that are used for the comparison of the palaeo-model 
and palaeo-data in this study are highlighted in red. The map was generated in 
Paraview v.5.6 (https://www.paraview.org/). b, Simulated tree fraction (%) at 
Horn of Africa (land grid points nearest to RC09-166) (grey, dark-blue bars) and 
stable hydrogen isotopic composition of leaf wax, corrected for ice volume 
contributions from the Gulf of Aden marine sediment core RC09-166” (orange), 
indicating changes in hydroclimate. c, Relative precipitation changes (%) 
simulated by LOVECLIM transient model (all forcings) for 11° E, 19° S (grey, 
dark-blue bars) and grain-size aridity index reconstructed from sediment core 
MD96-2094” (orange). d, Grass fraction changes simulated by LOVECLIM transient 
model (all forcings) at 11° E, 14–17° S (grey, dark-blue bars) and reconstructed 
δ¹³C changes of n-alkanes (orange) (South Atlantic sediment core MD08-3167) 
indicative of abundance of C₃ and C₄ plants in the Namibian desert and further 
inland®. 
Extended Data Table 1 | L0 mitogenomes included in this study 

Study reference                            Number of L0 mitogenomes 
Barbieri et al., AJHG 20137”               485 
Barbieri et al., AJPA 2014°°               26 
Barbieri et al., EJHG 2013                 33 
Barbieri et al., MBE 2012°”                10 
Barbieri et al., PlosOne 2014              115 
Batini et al., MBE 2011°°                  9 
Behar et al., AJHG 2008'                   64 
Chan et al., Plos One 2015°                77* 
Eaaswarkhanth et al., EJHG 2009”°          3 
Gonder et al., MBE 2007”'                  27 
Herrnstadt et al., AJHM 2002°°             2† 
Horai et al., PNAS 1995⁷²                  1‡ 
Ingman et al., Nature 20007°               5 
Just et al., FSIG 2008”4                   8 
Kivisild et al., Genetics 2006             25† 
Kujanova et al., AJPA 2009”°               4§ 
Maca-Meyer et al., BMC Genet 2001⁷⁶        1 
Macaulay et al., Science 2005””            1 
Margaryan et al., Curr Biol 2017⁷⁸         1 
McCrow et al., Prostate 2016⁴⁷             37* 
Morris et al., GBE 2014"°                  1* 
Olivieri et al., MBE 20177°                4 
Rito et al., Plos One 2013°                42 
Schuster et al., Nature 2010°°             6* 
van der Walt et al., EJHG 2012°°           20 
Vyas et al., AJPA 2016°'                   12 
Total public                               1,019 
This study                                 198 
Total L0 mitogenomes                       1,217 

Numbers of mitogenomes taken from previously published studies. 
*Previously published data by our group with verified population metadata. 
†Mitochondrial DNA sequences of the coding-region only. 
‡Sequence has non-canonical start position corresponding to position 577 of rCRS. 
§Coriell cell lines. 

Extended Data Table 2 | KhoeSan population identifiers used in this study 


Broad identifier Geographic Broad language Ethno-linguistic identifiers 
identifier group 


KhoeSan 


KhoeSan ancestral 


Software and code 


Policy information about availability of computer code 


Data collection The nucleotide sequences of 1,019 L0 mitogenomes were downloaded from NCBI. 
Data analysis All software used is described and referenced in the METHODS section. No new software algorithms or code was established. These 
include: 


Mitogenomic and phylogenetic analyses: 

Consensus mitogenome sequences were derived by first identifying variants relative to rCRS using samtools (v1.3.1) mpileup (with 
parameters -d 10000 -L 1000 -Q 7 -h 50 -o 10 -e 17 -m 4) and bcftools (v1.3.1). Variant data was converted to fasta format using 
samtools’ vcfutils.pl vcf2fq. Mitochondrial haplogroups were evaluated (called) using HaploGrep2 (v2.1.13) based on PhyloTree Build 17. 
Multiple sequence alignment (for phylogenetic inference) was performed using FastTree v2.1.7 (SSE3) using the generalised time 
reversible (-gtr) and discrete gamma model with 20 rate categories (-gamma). Bayesian phylogenetic inferences and divergence times 
were calculated using BEAST2 v2.4.2 with BEAGLE 2.0. Multiple sequence alignment of the subset of 461 AMH and seven Neanderthal 
mitogenomes was converted to NEXUS format using the convert function of seqmagick v0.6.1 (https://fhcrc.github.io/seqmagick) with 
parameter --alphabet dna-ambiguous. This provided the input to BEAST2. Specifically, BEAUti v2.4.2 was used to set up the phylogenetic 
model, assuming: (i) the Gamma Site Model with 6 gamma categories and no invariant sites; (ii) the generalized time reversible 
substitution model; (iii) a strict constant clock model with a normal prior with μ = 1.665 × 10⁻⁸ and σ = 1.479 × 10⁻⁹ based on Soares et 
al. 2009⁶⁰; and (iv) a Coalescent Constant Population. Five BEAST replicates were performed, each with 100 million MCMC iterations, 
sampling every 10,000. Tracer v1.6 was used to evaluate BEAST trace files. Replicates were combined using LogCombiner v2.4.2. 
Sampled trees from BEAST were summarized into a single Maximum Clade Credibility target tree using TreeAnnotator v2.4.2 for each of 
the five replicates, discarding the first 10% as burn-ins. FigTree v1.4.2+ was used to visualize all resulting trees. 


Bayesian Skyline Plot (BSP) analysis: For each haplogroup of interest (e.g. L0a, L0d1’2, and L0k), a nexus file was derived using SeqMagic 
v0.6.1 as described above. BSP analyses were performed using BEAST2, using BEAUti 2 for model setup as before, with the following key 
differences: (i) the gamma shape of the Gamma Site Model was estimated with an exponential prior with mean = 1.0 and offset = 0.0; (ii) 
the molecular clock was fixed (not estimated) at 1.665 × 10⁻⁸ based on Soares et al.⁶⁰; and (iii) the phylogenetic tree prior was set to 
Coalescent Bayesian Skyline, assuming 20 intervals between the root of the tree and the present time. Tracer v1.6 was used to 
reconstruct the Bayesian Skyline from the sampled trees for each analysis, using a stepwise constant variant and the lower 95% highest 
posterior density of the root height as the maximum time. 


Climate model: LOVECLIM earth system model 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 
Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 
- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The consensus sequences for this set of 198 mitogenomes have been deposited to NCBI with Accession Numbers MK248274-MK248471. 


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[X] Life sciences [_] Behavioural & social sciences [_] Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size No sample size calculation was performed as all mitogenomes representing a rarer L0 haplogroup were included. Pre-screening was 
performed on roughly 500 population relevant samples, identifying 198 of interest to undergo complete mitogenome sequencing and 
inclusion in this study. All relevant published data were also downloaded for a total study of 1,217. Sample sizes are therefore ALL inclusive (no 
rare L0 lineage and mitogenome was excluded) and therefore sufficient. 


Data exclusions All 1,217 mitogenomes were used to determine haplogroup frequencies and reconstruct geographic dispersals. A focused subset of 461 
mitogenomes, including all rare lineages, was used to establish within-L0 coalescence times. The smaller subset was necessary due to the 
computational burden of the BEAST analysis. 


Replication L0 haplogroup pre-screening (2,673 bp region) using Sanger sequencing for the 198 mitogenomes that underwent Ion Torrent deep complete 
mitogenome sequencing, providing an internal experimental validation. All samples were validated. 


Randomization Not relevant to this study as groupings (haplogroups) were identified via mitogenomic data. KhoeSan geographic classifications were based on 
self-identification (ethnic classifications) of participating subjects and country of origin. 


Blinding Investigators were blinded to the subject identifiers during analysis, as were the climate physicists blinded to the hypotheses and distributions 
associated with the mitogenomic data. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
n/a | Involved in the study n/a | Involved in the study 
Antibodies ChIP-seq 
Eukaryotic cell lines Flow cytometry 
Palaeontology MRI-based neuroimaging 


Animals and other organisms 


Human research participants 
Palaeontology 


Specimen provenance NA - no specimens were collected for the purpose of this study - only published data used. 
Specimen deposition NA - the study involved published data 
Dating methods No new specimens or dates are provided or used. 


Tick this box to confirm that the raw and calibrated dates are available in the paper or in Supplementary Information. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Southern Africans from Namibia and South Africa representing and self-identifying as ancestrally from a KhoeSan or KhoeSan 
ancestral population identifiers (as outlined within the METHODS) were recruited. There was no gender bias and all participants 
were greater than 18 years of age as per ethical requirements. 
Recruitment Participants were recruited based on their self-identified ethnicity. Recruitment was led by local researchers and coauthors 
(MSRB, DCP or VMH in South Africa) or in Namibia (HATF, as well as VMH). The study was explained to the participants and 
communities and in some communities, especially contemporary Namibian Kalahari KhoeSan populations, this took place over 
an extensive period, with regular engagement with the communities by VMH over a 10 year period. In these communities 
recruitment was a decision made by the entire community. All recruiters are familiar with local languages and cultures. There 
were no other biases that would impact this study. All possible populations that may carry a L0 mitogenome and live within the 
borders of Namibia and South Africa and were willing to participate, were included. 


Ethics oversight The study was reviewed and approved by the Ministry of Health and Social Services (MOHSS) in Namibia (#17-3-3 2008, 2014 and 
2019), with additional local approvals from community leaders, the University of Pretoria Human Research Ethics Committee 
(HREC #43/2010 and HREC #280/2017), including US Federal-wide assurance (FWA00002567 and IRB00002235 IORG0001762), 
as well as the South African National Blood Service (SANBS) HREC (HREC #2012/11). Isolated DNA was shipped under the 
Republic of South Africa Department of Health Export Permit (#J1/2/4/2), in accordance with the National Health Act 2003, to 
the Garvan Institute of Medical Research in Australia. Mitogenome sequencing was performed in accordance with site-specific 
approval granted by St Vincent’s Hospital HREC in Australia (SVH 15/227). 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 
Human achievements are often preceded by repeated attempts that fail, but little is 
known about the mechanisms that govern the dynamics of failure. Here, building on 
previous research relating to innovation’ ’, human dynamics and learning” ”, we 
develop a simple one-parameter model that mimics how successful future attempts 
build on past efforts. Solving this model analytically suggests that a phase transition 
separates the dynamics of failure into regions of progression or stagnation and 
predicts that, near the critical threshold, agents who share similar characteristics and 
learning strategies may experience fundamentally different outcomes following 
failures. Above the critical point, agents exploit incremental refinements to 
systematically advance towards success, whereas below it, they explore disjoint 
opportunities without a pattern of improvement. The model makes several 
empirically testable predictions, demonstrating that those who eventually succeed 
and those who do not may initially appear similar, but can be characterized by 
fundamentally distinct failure dynamics in terms of the efficiency and quality 
associated with each subsequent attempt. We collected large-scale data from three 
disparate domains and traced repeated attempts by investigators to obtain National 
Institutes of Health (NIH) grants to fund their research, innovators to successfully exit 
their startup ventures, and terrorist organizations to claim casualties in violent 
attacks. We find broadly consistent empirical support across all three domains, which 
systematically verifies each prediction of our model. Together, our findings unveil 
detectable yet previously unknown early signals that enable us to identify failure 
dynamics that will lead to ultimate success or failure. Given the ubiquitous nature of 
failure and the paucity of quantitative approaches to understand it, these results 
represent an initial step towards the deeper understanding of the complex dynamics 
underlying failure. 


To understand the dynamics of failure, we collected three large-scale 
datasets (Supplementary Information 1). The first dataset (D₁) contains 
all R01 grant applications submitted to the NIH (776,721 applications 
by 139,091 investigators, 1985–2015; Supplementary Information 1.1). 
For each grant application, we obtained ground-truth information on 
whether or not it was funded, allowing us to reconstruct individual 
application histories and their repeated attempts to obtain funding. 
Our second dataset (D₂) traces startup investment records from 
VentureXpert’® (58,111 startup companies involving 253,579 innovators, 
1970–2016; Supplementary Information 1.2). Tracing every startup 
in which venture capital firms invested, D₂ allows us to reconstruct 
individual career histories counting successive ventures in which they 
were involved. Here we follow previous studies in the entrepreneurship 
literature’’, and classify successful ventures as those that achieved 
initial public offering (IPO) or high-value mergers and acquisitions, 
and correspondingly failed attempts as those that failed to obtain 
such an exit within five years after their first investment by venture 
capital firms. Going beyond traditional innovation domains, we 
collected our third dataset (D₃) from the Global Terrorism Database”’ 
(170,350 terrorist attacks by 3,178 terrorist organizations, 1970–2017; 
Supplementary Information 1.3). For each organization we trace their 
attack histories”"”*, and classify success as fatal attacks that killed at 
least one person, and correspondingly failure as those that failed to 
claim casualties. 


Mechanisms of chance and learning 


Chance and learning” are two primary mechanisms that explain how 
failures may lead to success. If each attempt has a certain likelihood 
of success, the probability that multiple attempts all lead to failure 
decreases exponentially with each trial. The chance model therefore 
emphasizes the role of luck, suggesting that success eventually arises 


¹Center for Science of Science and Innovation, Northwestern University, Evanston, IL, USA. ²Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL, USA. 
³McCormick School of Engineering, Northwestern University, Evanston, IL, USA. ⁴Kellogg School of Management, Northwestern University, Evanston, IL, USA. ⁵Department of Sociology, 
Fig. 1 | Mechanisms of chance and learning. a–j, We compare theoretical 
predictions and empirical measurements for performance changes (a–e) as 
well as the length distribution of failure streaks (f–j). a, f, The chance model 
predicts no performance change (a) with a failure streak length that follows an 
exponential distribution (f). b, g, The learning hypothesis predicts improved 
performance (b) with failure streaks that are shorter than expected by the 
chance model, corresponding to a faster-than-exponential distribution (g). Both 
hypotheses are contested by empirical patterns observed across the three 
datasets. To ensure that performance metrics are comparable across data and 
models, we standardized performance measures according to their underlying 
distribution (Supplementary Information 5.1). c–e, We find that failures in real 
data are associated with improved performance between the first and 

from an accumulation of independent trials. To test this, we compared 
the performance of the first and penultimate attempt within failure 
streaks (Supplementary Information 5.1), measured by NIH percentile 
score for a grant application (D_1), investment size by venture capital 
firms to a company (D_2) and number of wounded individuals by an attack 
(D_3). We find that across all three datasets, the penultimate attempt 
shows systematically better performance than the initial attempt 
(Fig. 1c–e). These results reject the hypothesis that success is simply 
driven by chance (Fig. 1a) but lend support to the learning mechanism 
(Fig. 1b), which suggests that failure may teach valuable lessons that 
are difficult to learn otherwise. As such, learning reduces the number 
of failures required to achieve success, and predicts that failure 
streaks should follow a narrower length distribution (Fig. 1g) than the 
exponential distribution predicted by chance (Fig. 1f). However, across 
all three domains, the length of failure streaks follows a fat-tailed 
distribution (Fig. 1h–j, Supplementary Information 5.2), indicating that 
despite improvements in performance, failures are characterized by 
longer-than-expected streaks before the onset of success. Together, 
these observations demonstrate that neither chance nor learning alone 
can explain the empirical patterns that underlie failures, suggesting 
that more complex dynamics may be at work. 
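
As a point of reference for the chance model, the exponential form of the streak-length distribution can be written out explicitly. The sketch below assumes, purely for illustration, that every attempt succeeds independently with the same probability q; the symbols q and λ are assumptions introduced here and are not taken from the analysis above.

```latex
% Chance model (illustrative assumption): attempts are independent and each
% succeeds with the same probability q. A failure streak has length at least n
% exactly when the first n attempts all fail.
P(N \ge n) = (1-q)^{n} = e^{-n/\lambda}, \qquad \lambda = -\frac{1}{\ln(1-q)} .
```

Under this assumption the logarithm of P(N ≥ n) is linear in n. A streak-length distribution that decays faster than this line would be the signature expected from learning, whereas a fat tail decays more slowly.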


Modelling dynamics of failure 


Here we explore the interplay between chance and learning by develop- 
ing a simple one-parameter model that mimics how future attempts 
build on previous failures (Fig. 2a, b, Supplementary Information 3.1). We 
consider that each attempt consists of many independent, unweighted 
components, with each component i being characterized by an evaluation 
score x^(i) (Fig. 2a). For example, components for the submission of 
an NIH proposal include constructing a biosketch, assembling a budget, 
writing a data management plan, adding preliminary data and outlining 
broader impacts. We also note that granting agencies often provide 
rubrics to grade proposals on specific components. 

To formulate a new attempt, one goes through each component, and 
decides to either create a new version (with probability p) or reuse the 
best version x* among the previous k attempts (with probability 1 − p) 
(Fig. 2b). A new version is assigned a score drawn randomly from a 
uniform distribution U[0, 1], approximating the percentile of score 
distributions real systems follow. The decision to create a new version 


penultimate attempt. Two-sided Welch’s t-test; data are mean ± s.e.m. 
c, n = 4,872 (first), 5,966 (penultimate). d, n = 579 (first), 548 (penultimate). 
e, n = 231 (first), 230 (penultimate). h–j, At the same time, however, failure 
streaks are characterized by a fat-tailed length distribution, indicating that 
failure streaks in real data are longer than expected by chance. For clarity, here 
we show results for failure streaks for which the length is less than 21 
(Supplementary Information 5.2). We further construct a randomized 
sequence of successes and failures by assigning each attempt to agents at 
random (Supplementary Information 5.2). We find that failure streak length in 
the randomized sequence follows an exponential-like distribution, showing 
clear deviations from the data. 


is often not random, but driven by the quality of previous versions. 
Indeed, given the best version x*, 1 − x* captures the potential to improve 
it. The higher this potential, the more likely one may create a new 
version, prompting us to consider a simple relationship, p = (1 − x*)^α, 
with α > 0 (Methods, Supplementary Information 3.6). Creating a new 
version takes one unit of time with no certainty that its score will be 
higher or lower than the previous one. By contrast, reusing the best 
version from the past saves time, and allows the component to retain 
its best score x*. 

Here we explore a single parameter k for our model, measuring the 
number of previous attempts one considers when formulating a new 
one (Fig. 2b). Mathematically the dynamical process can be described 
as: with probability p, x_n ~ U[0, 1]; otherwise (with probability 
1 − p), x_n = x_n*, where x_n* = max{x_{n−k}, ..., x_{n−1}}. We quantify the 
dynamics of the model by calculating the quality of the nth attempt, 
⟨x_n⟩, which measures the average score of all components, and the 
efficiency after that attempt, ⟨t_n⟩, which captures the expected 
proportion of components updated in new versions. Let us first consider 
the two extreme cases. In the first case, k = 0 means that each attempt 
is independent from previous attempts (Supplementary Information 3.2). 
Here our model recovers the chance model, predicting that as n 
increases, both ⟨x_n⟩ and ⟨t_n⟩ remain constant (Extended Data Fig. 1a, d). 
That is, without considering past experience, failure does not lead to 
quality improvement. Nor is it more efficient to try again. 
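
The update rule above can be simulated directly. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `simulate_k_model`, the number of components, the seed and the convention that x* = 0 when nothing is remembered (so that p = 1 for k = 0) are all choices made here for illustration.

```python
import random


def simulate_k_model(k, alpha, n_attempts, n_components=200, seed=0):
    """Simulate the k model; return (quality, efficiency) per attempt.

    quality[n-1]    : average component score of attempt n, i.e. <x_n>
    efficiency[n-1] : fraction of components newly created in attempt n, i.e. <t_n>
    """
    rng = random.Random(seed)
    # The first attempt creates every component from scratch.
    history = [[rng.random()] for _ in range(n_components)]
    quality = [sum(h[-1] for h in history) / n_components]
    efficiency = [1.0]

    for _ in range(2, n_attempts + 1):
        new_count = 0
        for scores in history:
            # Best version among the previous k attempts (assumed 0 if k == 0).
            recent = scores[-k:] if k > 0 else []
            best = max(recent) if recent else 0.0
            p = (1.0 - best) ** alpha  # probability of creating a new version
            if rng.random() < p:
                scores.append(rng.random())  # new version, score ~ U[0, 1]
                new_count += 1
            else:
                scores.append(best)  # reuse the best remembered version
        quality.append(sum(h[-1] for h in history) / n_components)
        efficiency.append(new_count / n_components)
    return quality, efficiency


if __name__ == "__main__":
    for k in (0, 30):
        q, t = simulate_k_model(k=k, alpha=0.6, n_attempts=30)
        print(f"k={k}: quality {q[0]:.2f} -> {q[-1]:.2f}, "
              f"efficiency {t[0]:.2f} -> {t[-1]:.2f}")
```

With k = 0 every component is redrawn at each attempt, so both quantities stay flat; with k as large as the number of attempts, quality rises and the share of new components falls, which is the qualitative contrast the two extreme cases describe.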

The other extreme (k → ∞) considers all past attempts. The model 
predicts a temporal scaling in failure dynamics (Supplementary 
Information 3.3). That is, the time it takes to formulate a new attempt 
decays with n, asymptotically following a power law (Extended Data Fig. 1e): 


$$T_n \equiv \langle t_n \rangle / \langle t_1 \rangle \sim n^{-\gamma} \qquad (1)$$


where γ = γ_∞ = α/(α + 1) falls between 0 and 1 and ‘~’ indicates 
‘asymptotically proportional to’. Besides increased efficiency, new attempts 
also improve in quality, as the average potential for improvement decays 
according to ⟨1 − x_n⟩ ~ n^{−η_∞}, where η_∞ = min{γ_∞, 1 − γ_∞} (Extended Data 
Fig. 1b). Here the model recovers the canonical result from the learning 
literature, commonly known as Wright’s law. This is because, 
as experience accumulates, high-quality versions are preferentially 
retained, whereas their lower-quality counterparts are more likely to 
receive updates. As fresh attempts improve in quality (Extended Data 

Fig. 2 | The k model. a, We treat each attempt as a combination of many 
independent components (c^(i)). For attempt j, each component i is characterized 
by an evaluation score x_j^(i), which falls between 0 and 1. The score for a new 
version is often unknown until attempted, hence a new version is assigned a 
score, drawn randomly from the range 0–1. b, To formulate a new attempt, one 
can either create a new version (with probability p, green arrow) or reuse an 
existing version by choosing the best one among past versions x* (with 
probability 1 − p, red arrow). P(x > x*) = 1 − x* captures the potential to improve on 
prior versions, prompting us to assume p = (1 − x*)^α where α > 0 characterizes the 
propensity of an agent to create new versions given the quality of existing ones. 
c, The analytical solution of the model reveals that the system is separated into 
three regimes by two critical points k* and k* + 1. The solid line shows the 
extended solution space of our analytical results. d–i, Simulation results from 
the model (α = 0.6) for quality (d–f) and efficiency (g–i) trajectories for different 
k parameters, showing distinct dynamical behaviour in different regimes. All 
results are based on simulations averaged over 10^4 times. j, k, A phase transition 
around k* predicts the coexistence of two groups that fall in the stagnation and 
progression regimes, respectively. 


Fig. 1b), they reduce the need to start anew, thus increasing the efficiency 
of future attempts (Extended Data Fig. 1e). 

These two limiting cases (Extended Data Fig. 1c, f) might lead one to 
suspect a gradual emergence of scaling behaviour as we learn from more 
failures. By contrast, as we increase parameter k, the scaling exponent 
γ follows a discontinuous pattern (Fig. 2c, Supplementary 
Information 3.4) and only varies within a narrow interval of 
⌊k*⌋ < k < ⌈k*⌉ + 1, where k* = 1/α. Indeed, when k is small (k < k*), 
the system converges back to the same asymptotic behaviour as k = 0 
(Fig. 2c, d, g). In this region, k is not large enough to retain a good 
version once it appears. As a result, while performance might improve 
slightly in the first few attempts, it quickly saturates. In this 
region, agents reject previous attempts and flail around for new 
versions, not processing enough feedback to initiate a pattern of 
intelligent improvement, prompting us to call it the stagnation region. 
Once k passes the critical threshold k*, however, scaling behaviour 
emerges (Fig. 2c, e, h), indicating that the system enters a region of 
progression, in which failures lead to continuous improvement in both 
quality and efficiency. Nevertheless, with a single additional 
experience considered, the system quickly hits the second critical point 
k* + 1, beyond which the scaling exponent γ becomes independent of k 
(Fig. 2c, f, i). This means that once ⌈k*⌉ + 1 previous failures are 
considered, the system is characterized by the same dynamical behaviour 
as k → ∞, indicating that ⌈k*⌉ + 1 attempts are sufficient to recover 
the same rate of improvement as considering every failure from the past. 

Importantly, the two critical points in our model can be mapped 
to phase transitions within a canonical ensemble consisting of three 
energy levels (Extended Data Fig. 1g–j, Methods, Supplementary 
Information 3.5). Phase transitions indicate that small variations at the 
microscopic level may lead to fundamentally different macroscopic 
behaviours. For example, two individuals near the critical point may 
initially appear identical in their learning strategy or other 
characteristics, but depending on which region they inhabit, their 
outcomes following failures could differ considerably (Fig. 2j, k). In 
the progression region (k > k*), agents exploit rapid refinements to 
improve through past feedback. By contrast, those in the stagnation 
region (k < k*) do not seem to profit from failure, as their efforts 
stall in efficiency and saturate in quality. As such, the phase 
transitions uncovered in our simple model make four distinct 
predictions, which we now test directly in the contexts of science, 
entrepreneurship and security. 


Testing model predictions 

Not all failures lead to success 

Although we tend to focus on examples that eventually succeeded 
following failures, the stagnation region predicts that there exists a 
non-negligible fraction of cases that do not succeed following failures. 
We measure the number of failed cases that did not achieve eventual 
success in our three datasets, finding not only that members of the 
unsuccessful group exist, but also that the size of the unsuccessful group 
is of a similar order of magnitude as the successful group (Fig. 3a–c). 
Notably, the number of consecutive failures before the last attempt 
for the unsuccessful group follows a distribution that is statistically 
similar to that of the successful group (Fig. 3a–c), suggesting that 
people who ultimately succeeded did not try more or less than their 
unsuccessful counterparts. 


Early signals for ultimate success or failure 

Our model predicts that the successful group is characterized by 
power-law temporal scaling (Eq. (1)), which is absent for the 
unsuccessful group (Fig. 2j), implying that the two groups may follow 
fundamentally different failure dynamics that are distinguishable at an 
early stage. To test this prediction, we measure the average inter-event 
time between two failures, T_n, as a function of the number of failures 
(Supplementary Information 5.3). Figure 3d–f shows three important 
observations. First, for the successful group, T_n decays with n across 
all three domains, approximately following a power law, as captured by 
Eq. (1) (Extended Data Fig. 2, Supplementary Information 5.3, 
Supplementary Table 4). The scaling exponents are within a similar range 
as those reported in learning curves, further supporting the validity of 
power-law scaling. 
Although the three datasets are among the largest in their respective 
domains, agents with a large number of failures are exceedingly rare, 
limiting the range of n that can be measured empirically. We therefore 
test whether alternative functions may offer a better fit, finding a power 
law to be the consistently preferred choice (Supplementary Informa- 
tion 6.2). Second, we found that temporal scaling disappears when 
we measure the same quantity for the unsuccessful group (Fig. 3d-f), 
consistent with predictions about the stagnation region. Third, the 
two groups show distinguishable failure dynamics as early as n = 2, 

(CDF) of the number of consecutive failures before the last attempt for 
successful and unsuccessful groups. To eliminate the possibility that agents 
were simply in the process of formulating their next attempt, we focus on cases 
for which it has been at least five years since their last failure. In each of our 
three datasets, the two distributions are statistically indistinguishable 
(Kolmogorov-Smirnov test for samples with at least one failure). For clarity, 
here we show results for fewer than 21 failures (Supplementary Information 5.2). 
Inset, the sample size of successful and unsuccessful groups, showing their size 
is of a similar order of magnitude. d–f, Early temporal signals separate 
successful and unsuccessful groups. d, n = 43,705 (successful), 15,132 
(unsuccessful). e, n = 2,455 (successful), 16,656 (unsuccessful). f, n = 446 
(successful), 321 (unsuccessful). For each group, we measure the average 
inter-event time between two failures, T_n ≡ t_n/t_1, as a function of the number of 
attempts. Dots and shaded areas are mean ± s.e.m. measured from data 
(Supplementary Information 5.3). All successful groups manifest power-law 
scaling T_n ~ n^{−γ} (Extended Data Fig. 2). The two groups show distinguishable 
temporal dynamics for n=2. Two-sided Welch’s t-test; P=3.02 x 10, 7.18 x 10°, 
9.42 x10 for comparisons of successful and unsuccessful groups in 

d, e, f, respectively. This temporal scaling is absent for unsuccessful groups. 
g–i, Performance at first attempt appears indistinguishable between 
successful and unsuccessful groups that experienced a large number of 
consecutive failures before the last attempt (at least 5 for D_1, 3 for D_2 and 2 for D_3, 
two-sided Welch’s t-test), but becomes distinguishable at the second attempt 
(two-sided Welch’s t-test). Whereas performance improves for the successful 
group (one-sided Welch’s t-test), this improvement is absent for the 
unsuccessful group (one-sided Welch’s t-test). Data are mean ± s.e.m. g, n = 628, 
145, 571, 123 (from left to right). h, n = 248, 1,332, 237, 1,312 (from left to right). 
i, n = 231, 173, 229, 174 (from left to right). 


suggesting noteworthy early signals that separate those who eventu- 
ally succeed from those who do not. 

Observations uncovered in Fig. 3d-f are notable for two main rea- 
sons. First, failures captured by the three datasets differ widely in their 
scope, scale, definition and temporal resolution, yet despite these 
differences, they are characterized by remarkably similar dynamical 
patterns predicted by our simple model. Second, although one might 
expect that the last attempt was crucial in separating the two groups, 
as the model predicts, successful and unsuccessful groups each follow 
their respective, highly predictable patterns, which are distinguishable 
long before the eventual outcome becomes apparent. Indeed, we use 
D_1 to set up a prediction task (Extended Data Fig. 3, Methods, 
Supplementary Information 6.1) to predict ultimate success or failure using 
only temporal features, which yielded substantial predictive power. 
To test whether the observed patterns in Fig. 3d–f may simply reflect 
preexisting population differences, we take agents who experienced 
a large number of failures, and measure performance from their first 
attempt. We find that for all three domains, the two populations were 
statistically indistinguishable in their initial performance (Fig. 3g-i), 
which leads us to the next prediction. 


Diverging patterns of performance improvement 

Although the two groups may have begun with similar performances, 
the model predicts that they may experience different performance 
gains through failures (Fig. 2k). We compared performance at first and 
second attempts, finding significant improvement for the successful 
group (Fig. 3g–i), which is absent for the unsuccessful group. We further 
repeated our measurements by comparing the first and penultimate 
or halfway attempt, arriving at the same conclusion (Extended Data 
Fig. 9j–o, Supplementary Information 7.3). This prediction explains the 
patterns that were observed in Fig. 1c–e, which leads us to the second 
puzzle described in Fig. 1h-j: if performance improves, why are failure 
streaks longer than we expect? 
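
The comparisons between attempts rely on Welch's t-test, which does not assume equal variances in the two samples. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom using the Python standard library; the function name and the toy samples are assumptions, and obtaining a P value would additionally require the Student t distribution, which is not included here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


if __name__ == "__main__":
    first = [1, 2, 3, 4, 5]      # toy scores at the first attempt (assumed)
    second = [2, 4, 6, 8, 10]    # toy scores at the second attempt (assumed)
    print(welch_t(first, second))
```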


Failure streaks follow a Weibull distribution 

One key difference between progression and stagnation regimes is 
the propensity to reuse past components. From the perspective of 
exploration versus exploitation, however, reuse helps one to retain 
a good version when it appears, but it could also keep one in a 
suboptimal position for longer, leading to our final prediction: the length of 
failure streaks follows a Weibull distribution (Supplementary Table 4): 


$$P(N \geq n) \sim e^{-(n/\lambda)^{\beta}} \qquad (2)$$


Moreover, the shape parameter β is connected with the temporal 
scaling exponent γ through a scaling identity (Supplementary 
Information 3.8) 


$$\beta + \gamma = 1 \qquad (3)$$


This means that if we fit the streak length distribution in Fig. 1h–j 
to obtain the shape parameter β, it should relate to the temporal 
scaling exponent γ, which is obtained from Fig. 3d–f. Comparing β and 
γ measured independently across all three datasets shows consistency 
between our data and the scaling identity Eq. (3) (Supplementary 
Table 4). 
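
One simple way to estimate the Weibull shape parameter is to linearize the survival function: if P(N ≥ n) = exp(−(n/λ)^β), then ln(−ln P) = β ln n − β ln λ. The sketch below uses that linearization with a least-squares fit. It is an assumption-laden illustration (hypothetical function names, noise-free synthetic input) rather than the fitting procedure used for Supplementary Table 4.

```python
import math


def weibull_survival(n, lam, beta):
    """P(N >= n) for a Weibull streak-length distribution."""
    return math.exp(-((n / lam) ** beta))


def fit_weibull(ns, survival):
    """Fit (beta, lam) from ln(-ln S) = beta * ln n - beta * ln lam."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(-math.log(s)) for s in survival]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    beta = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - beta * mean_x
    lam = math.exp(-intercept / beta)
    return beta, lam


if __name__ == "__main__":
    ns = list(range(1, 21))
    S = [weibull_survival(n, lam=3.0, beta=0.6) for n in ns]
    beta, lam = fit_weibull(ns, S)
    # The scaling identity beta + gamma = 1 then implies this exponent:
    print(round(beta, 3), round(lam, 3), round(1 - beta, 3))
```

The value 1 − β obtained this way is what Eq. (3) says should match the exponent γ measured independently from the inter-event times.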

We test the robustness of our results along several dimensions, 
arriving at broadly consistent conclusions (Methods, Extended Data Figs. 5-9, 
Supplementary Information 7). We include further quantitative tests 
for model assumptions and additional interpretations of the model in 
the Methods. 


Discussion 

As a single parameter, k necessarily combines individual, organizational 
and environmental factors in learning (Supplementary Information 3.1). 
The one-parameter model developed here represents a minimal 
model (Supplementary Information 3.7), which can be extended 
into more complex frameworks. For example, agents may have varied 
incentives to improve or may differ in their confidence and ability to 
judge their previous work. Such factors trace heterogeneity in the 
population and can be captured by the α parameter, which quantifies 
the propensity of individuals to change given feedback. This led us to 
develop the k–α model (Methods), which predicts a two-dimensional 
phase diagram with three distinct phases (Extended Data Fig. 10a, b, 
Methods, Supplementary Information 4.1). The model can be further 
extended to capture fuzzy inference from past feedback, allowing 
agents to not always choose the best previous versions (see ‘k–α–δ 
model’ in the Methods, Extended Data Fig. 10c, d, Supplementary 
Information 4.2). 

The model also offers relevant insights for the understanding of 
learning curves. For example, the second critical point of the model 
suggests the existence of a minimum number of failures one needs to 
consider (k* + 1), indicating that it is unnecessary to learn from all past 
experiences to achieve a maximal learning rate. This finding poses a 
potential explanation for the widespread nature of Wright’s law across 
a wide variety of domains, particularly given the fact that in many of 
those domains not all past experiences can be considered (Supple- 
mentary Information 2). 

Furthermore, our simple model does not explicitly account for many 
of the complexities that characterize real settings that may affect failure 
dynamics, such as knowledge depreciation, competition, forgetting 
and transfer or vicarious learning from others. However, the model 
offers a theoretical basis to incorporate additional factors, including 
individual and organizational characteristics that may affect learning 
(see Methods for various factors related to learning rate, including 
organizational learning, previous achievements and gender differences), 
demonstrating that our modelling framework can serve as a springboard 
for anchoring future models and analyses. 


Concluding remarks 


Together, these results support the hypothesis that if future attempts 
systematically build on past failures, the dynamics of repeated failures 
may reveal statistical signatures that are discernible at an early stage. 
Traditionally the main distinction between ultimate success and fail- 
ure following repeated attempts has been attributed to differences 
in luck, learning strategies or individual characteristics, but here our 
model offers an important explanation with crucial implications: even 
in the absence of distinguishing initial characteristics, agents may 
still experience fundamentally different outcomes. Indeed, Thomas 
Edison once said, ‘Many of life’s failures are people who did not 
realize how close they were to success when they gave up.’ Our results 
unveil identifiable early signals that help us to predict the eventual 
outcome to which failures lead. Together, they not only deepen our 
understanding of the complex dynamics beneath failure, but also 
hold lessons for individuals and organizations that experience failure 
and the institutions that aim to facilitate or hinder their eventual 
breakthrough. 


Methods 


Model assumptions 

Parameter k in our model can be viewed as approximating the ‘memory’ 
of past versions. The rationale for using k in the model is rooted in the 
learning literature showing that the general notion of ‘forgetting’ takes 
multiple forms, often representing a combination of individual, 
organizational and environmental factors. Indeed, several relevant factors may 
be at play, which can generate patterns similar to forgetting. For 
example, in rapidly shifting innovation domains, not all past failures remain 
useful over time and some become obsolete. Consider the possibility 
of knowledge depreciation, which could also apply in our settings as 
environments (of scientific knowledge, capital markets or security 
situations) evolve over time, such that past experience could become useless 
even if memorized. For example, an NIH proposal from four failures ago may 
become irrelevant as the ideas proposed have been proven wrong, or 
published by the principal investigator or another research group. 
Similarly, startup ideas from the dot-com era may be irrelevant in the 
era of artificial intelligence and Blockchain. Terrorist tactics can also 
depreciate over time, as past strategies attracted media coverage and 
gave rise to tighter security measures to defend against them. This 
line of reasoning supports the intuition that recent attempts are most 
relevant. It is also consistent with the learning literature, which suggests 
knowledge forgetting can happen in distinct ways, either voluntarily 
or involuntarily. Given these factors, here we select a single parameter 
k to encapsulate a variety of potential contributing factors. 


Quantifying component dynamics 

To empirically measure the dynamics of components, we collected 
abstract information for all RO1 applications submitted after 2008 
(Supplementary Information 5.4). To this data corpus we applied a nat- 
ural-language-processing technique to extract MeSH (medical subject 
headings) terms from each abstract, which approximate the methods, 
physical states and processes involved in the proposed research. This 
allows us to quantify the dynamics of component reuse from previous 
proposals for the successful group. We measure the new versions of 
components by the number of new MeSH terms (terms that did not appear 
in the previous k submissions, defined as m_n) and plot M_n = ⟨m_n⟩/⟨m_1⟩ as 
a function of n. Our model suggests that given k, we can use M_n to mimic 
the temporal dynamics of T_n. More precisely, for the successful group, 
we expect to observe that for large k (k > k*), M_n and T_n are characterized 
by similar dynamics. For small k (k < k*), however, the two quantities 
could be quite different. As shown in Extended Data Fig. 4, our 
empirical analysis shows that the two curves indeed follow different dynamics 
for small k (k < 3), but the dynamics of M_n and T_n become statistically 
indistinguishable for k > 3 (from 4 to ∞), approximately following a power 
law with γ ≈ 0.35. We cannot directly examine component dynamics for 
the unsuccessful group owing to the lack of sufficient data: by definition, 
agents in this group submitted no proposal after 2010, and the 
unsuccessful abstract data only go back to 2008. 
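
The counting step behind m_n can be illustrated with sets of terms. The sketch assumes that each submission has already been reduced to a set of MeSH terms; the function name and the toy terms are hypothetical, and the extraction step itself is not shown.

```python
def new_term_counts(submissions, k):
    """Return m_n for each submission: terms absent from the previous k submissions.

    submissions : list of sets of terms, in chronological order
    k           : number of previous submissions to look back over
    """
    counts = []
    for n, terms in enumerate(submissions):
        window = submissions[max(0, n - k):n]   # the previous k submissions
        seen = set().union(*window)             # empty set if the window is empty
        counts.append(len(set(terms) - seen))
    return counts


if __name__ == "__main__":
    # Toy example with made-up terms.
    proposals = [{"a", "b"}, {"b", "c"}, {"a", "d"}]
    print(new_term_counts(proposals, k=1))  # "a" counts as new again in the third
    print(new_term_counts(proposals, k=2))  # "a" is remembered from the first
```

Averaging these counts over agents for each n and dividing by the average at n = 1 gives the ratio M_n described above.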


Phase transitions 
To understand the nature of two transition points in our model, here 
we consider a canonical ensemble of N particles (N → ∞) and three 
energy states E_A(h) = 1, E_B(h) = (2h − 1)^2 and E_C(h) = 1, where h denotes 
the external field. We can write down the partition function of the 
system, Z = e^{−N E_A(h)} + e^{−N E_B(h)} + e^{−N E_C(h)}, and calculate its free 
energy density f = −ln[Z]/N. In this system, it can be shown that the 
magnetization density m = ∂f/∂h is discontinuous at the boundary of two 
energy states, E_A(h) = E_B(h) and E_B(h) = E_C(h), characterized by two 
phase transitions at h = 0 and h = 1, respectively. 

We notice that the canonical ensemble considered above has a mapping 
to our model. Indeed, denoting Γ = k* × γ/(1 − γ) and K = k − k*, 
we can rescale the system as Γ = min{max{Γ_A(K), Γ_B(K)}, Γ_C(K)}, where 
Γ_A(K) = 0, Γ_B(K) = K and Γ_C(K) = 1, allowing us to map the two systems 
through f → (2Γ − 1)^2, N → ln[n], h → K and E_i(h) = (2Γ_i(K) − 1)^2 for 
i = A, B, C (Extended Data Fig. 1g–j). 
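
The rescaling can be made concrete by inverting Γ = k* γ/(1 − γ) to obtain γ = Γ/(Γ + k*). The sketch below does this for the one-parameter model with k* = 1/α. The function names are assumptions, and the code is only a restatement of the relations quoted in this section, not an independent derivation.

```python
def rescaled_gamma(k, alpha):
    """Gamma = min{max{0, K}, 1} with K = k - k* and k* = 1/alpha."""
    k_star = 1.0 / alpha
    return min(max(0.0, k - k_star), 1.0)


def scaling_exponent(k, alpha):
    """Invert Gamma = k* * gamma / (1 - gamma) to recover gamma."""
    k_star = 1.0 / alpha
    big_gamma = rescaled_gamma(k, alpha)
    return big_gamma / (big_gamma + k_star)


def mapped_energy(k, alpha):
    """Free-energy analogue (2*Gamma - 1)^2 from the ensemble mapping."""
    return (2.0 * rescaled_gamma(k, alpha) - 1.0) ** 2


if __name__ == "__main__":
    alpha = 0.5  # so k* = 2
    for k in (1, 2.5, 10):
        print(k, scaling_exponent(k, alpha), mapped_energy(k, alpha))
```

For α = 0.5 the exponent is 0 below k* = 2, varies between k* and k* + 1, and equals α/(α + 1) = 1/3 beyond k* + 1, which reproduces the three regimes described in the main text.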

To understand the origin of the two transition points, we can 
calculate the expected lifespan of a high-quality version, obtaining 
⟨u(x)⟩ ~ (1 − x)^{−min(k/k*, 1/k* + 1)} (Supplementary Information 3.4). 
The first critical point k* occurs when the first moment ⟨u⟩ diverges. 
Indeed, when k is small (k < k*), ⟨u⟩ is finite, indicating that 
high-quality versions can only be reused for a limited period of time. 
Once k passes critical point k*, however, ⟨u⟩ diverges, offering the 
possibility for a high-quality version to be retained for an unlimited 
period of time. The second critical point arises due to the competition 
between two dynamical forces: (1) whether the current best version 
becomes forgotten after k consecutive attempts in creating new versions 
(dominated by the k/k* term); or (2) it is substituted by an even better 
version (dominated by the 1/k* + 1 term). 

Note that while phase transitions carry exceptional importance in 
statistical physics, similar phenomena and concepts are also of fun- 
damental relevance in the social and behavioural science literatures. 
For example, critical thresholds have been observed and modelled 
in social settings that include shifts in the segregation of 
neighbourhoods, the formation of social networks and changes in collective 
opinions. In each case, slight shifts in microscale phenomena, such 
as average preference, group size or interaction intensity, condition a 
qualitative transition in macroscale outcomes. 


Alternative hypothesis, interpretation and robustness checks 
To better understand the role of heterogeneity in learning, we separated 
the successful group into narrow-win and clear-win subgroups based 
on their eventual performance. We find that, despite their eventual 
difference, the temporal dynamics of the two groups remain 
statistically indistinguishable (two-sided Welch’s t-test, P = 0.763 (D_1), 0.813 
(D_2), 0.259 (D_3), Extended Data Fig. 4), suggesting that the distinction 
between successful and unsuccessful groups appears the most critical, 
whereas agents within the successful group are characterized by similar 
dynamics, consistent with the predictions of our model. 

An alternative interpretation for the stalled efficiency of the 
unsuccessful group is an effort to hedge against failures: their efficiency did 
not improve because they spent more effort elsewhere. The three 
professions that we studied, NIH investigators, entrepreneurs and terrorists, 
involve varied levels of risk, exposure and commitment, which renders 
this explanation less likely. 

To test the robustness of our results, we vary the definitions of what 
constitutes the successful group (Supplementary Information 7.1) by 
excluding revisions in D_1 (Extended Data Fig. 6), changing the 
threshold of high-value mergers and acquisitions or controlling for unicorn 
companies in D_2 (Extended Data Fig. 7), and varying the types of attack 
or changing the threshold for fatal attacks in D_3 (Extended Data Fig. 8). 
We also vary the definition of unsuccessful groups (Extended Data Fig. 5, 
Supplementary Information 7.2) and test other measures to 
approximate performance (Extended Data Fig. 9j–o, Supplementary 
Information 7.4, 7.5). We further adjust for temporal variation by controlling for 
the overall success rate across different years (Extended Data Fig. 9a-i, 
Supplementary Information 7.3). Across all variations, our conclusions 
remain the same. 


Predicting ultimate success 

We use a simple logistic model to predict whether one may achieve 
success following N previously failed attempts in D_1, using only 
temporal features t_n (1 ≤ n ≤ N − 1) as predictors. To evaluate prediction 
accuracy, we calculate the area under the receiver operating characteris- 
tic (AUC) curve with tenfold cross-validation. We find that, by observing 
the timing of the first three failures alone, our simple temporal fea- 
ture yields high accuracy in predicting the eventual outcome with an 
AUC close to 0.7, which is significantly higher than random guessing 
(Mann-Whitney U-test, P< 10°; Extended Data Fig. 3a, Supplementary 
Information 6.1). We repeated the same prediction task on D_2 and D_3, 
arriving at similar conclusions (Extended Data Fig. 3b, c, Supplementary 
Information 6.1). The predictive power from temporal features alone 
is somewhat unexpected. Indeed, there are a large number of 
documented factors that affect the outcome of a grant application, 
ranging from the previous success rate to publication and citation records 
to the race, ethnicity and gender of the applicant. Here we ignore these 
factors, however, using only features that pertain to temporal scaling 
as prescribed by our model. This suggests that our predictive power 
represents a lower bound, which could be further improved and 
leveraged by incorporating additional factors. 
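
The AUC used to score such predictions has a simple pairwise interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the toy labels and scores are assumptions, and neither the logistic model nor the tenfold cross-validation is reproduced here.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    y = [0, 0, 1, 1]                    # 1 = eventual success (toy labels)
    predicted = [0.1, 0.4, 0.35, 0.8]   # toy predicted probabilities
    print(auc(y, predicted))
```

A value of 0.5 corresponds to random guessing, which is the baseline against which the reported AUC is compared.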


k–α model 

Agents may differ in the judgment of their own work or incentives to 
change given feedback, which can be captured by varying the α 
parameter in the original k model. Of the many influences on p, one key factor 
is the quality of existing versions, suggesting that p should be a function 
of x*. Consider the following two extreme cases. If x* → 0, existing 
versions of this component have one of the worst scores and, hence, a high 
potential for improvement when replaced with a new version. In this 
case, the likelihood of creating a new version is high, that is, p → 1. On 
the other hand, x* → 1 corresponds to a near-perfect version, yielding a 
decreased incentive to create a new one (p → 0). Indeed, P(x > x*) = 1 − x* 
captures the potential to improve on previous versions, prompting us 
to assume that p = (1 − x*)^α where α > 0 characterizes the propensity of an 
agent to create new versions given the quality of existing ones. 
Therefore, α → 0 indicates that regardless of one’s evaluation, the agent will 
always create a new version, whereas α → ∞ points to the other extreme 
where one does not create a new version unless the existing one is extremely bad 
(Extended Data Fig. 10a). Considering α as another tunable parameter, 
we arrive at a two-parameter model: the k–α model (Supplementary 
Information 4.1). 

To solve this model we can substitute k* with 1/α, and the indexes k/k* 
and 1/k* + 1 now become kα and α + 1. The extended model thus predicts 
the existence of three different phases on a two-dimensional phase 
diagram, with boundaries kα = 1 and (k − 1)α = 1 that separate the three 
phases (Extended Data Fig. 10b). The k–α model reduces back to the 
two critical points in the original k model when we fix α. The two 
parameters jointly define an effective K = k − k* = k − 1/α. The critical 
boundaries therefore reduce into two simple equations: K = 0 and K = 1. Note 
that the assumed relationship between p and (1 − x*) is not limited to a 
power law but can be relaxed into its asymptote form. Indeed, we show 
that as long as the function satisfies Tes > aasx*>1, the model offers 
the same predictions” (Extended Data Fig. 3, Supplementary Informa- 
tion 3.6). 
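As a minimal sketch (not the authors' code), the two boundaries can be checked numerically through the effective parameter K = k − 1/α. The function names and the regime labels are illustrative assumptions. "stagnation" and "progression" follow the wording used later in this section, while "intermediate" is a placeholder name for the middle regime.

```python
import math


def effective_k(k: float, alpha: float) -> float:
    """Effective parameter K = k - k*, with k* = 1/alpha (alpha > 0)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return k - 1.0 / alpha


def phase(k: float, alpha: float) -> str:
    """Label the regime of the k-alpha model (labels are illustrative).

    K < 0     is equivalent to k * alpha < 1
    K > 1     is equivalent to (k - 1) * alpha > 1
    K = 0, 1  are the two critical boundaries
    """
    K = effective_k(k, alpha)
    if math.isclose(K, 0.0, abs_tol=1e-12) or math.isclose(K, 1.0, abs_tol=1e-12):
        return "boundary"
    if K < 0:
        return "stagnation"
    if K < 1:
        return "intermediate"
    return "progression"


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(effective_k(k, 0.6), 3), phase(k, 0.6))
```

Fixing α and varying k recovers the two critical points of the original k model, since the boundaries then sit at k = 1/α and k = 1/α + 1.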


k–α–δ model

Agents may have fuzzy or unclear inference regarding past feedback,
and may therefore not always choose the version with the highest quality.
We can model the choice between different versions in a probabilistic
fashion, by introducing a δ parameter to the k–α model. Here the
probability of choosing the ith version as a baseline follows

$$P(i) = \frac{(1 - x_i)^{-\delta}}{Z}, \qquad n - k \le i \le n - 1,$$

where Z is the normalization factor, Z = Σ_{i=n−k}^{n−1} (1 − x_i)^{−δ}, and
k > 1. δ = 0 means one cannot differentiate between the quality of past
versions and selects randomly among different versions, whereas δ → ∞
indicates that one always chooses the previous version with the highest
quality, converging back to our original k model or the k–α model.
Incorporating δ leads to the k–α–δ model (Supplementary Information 4.2).
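The selection rule can be illustrated with a short sketch. It assumes the weight of version i is (1 − x_i)^(−δ), normalized over the k most recent versions, with quality scores x_i in [0, 1). The function name is hypothetical and not taken from the paper.

```python
def selection_probabilities(scores, delta):
    """Return P(i) proportional to (1 - x_i) ** (-delta) for each score x_i.

    `scores` holds the quality scores of the candidate versions
    (assumed to lie in [0, 1)); `delta` >= 0 controls how sharply the
    agent discriminates between them.
    """
    if any(not 0.0 <= x < 1.0 for x in scores):
        raise ValueError("scores must lie in [0, 1)")
    weights = [(1.0 - x) ** (-delta) for x in scores]
    z = sum(weights)  # normalization factor Z
    return [w / z for w in weights]


if __name__ == "__main__":
    recent = [0.2, 0.5, 0.9]
    print(selection_probabilities(recent, 0.0))   # uniform choice
    print(selection_probabilities(recent, 50.0))  # nearly always the best version
```

With δ = 0 every candidate is equally likely, and as δ grows the probability mass concentrates on the highest-quality version, matching the two limits described above.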

Analytically solving the model reveals interesting scaling behaviours
based on δ (Supplementary Information 4.2). Indeed, we find the scaling
behaviour of the system follows

$$\gamma(k, \alpha, \delta) = 1 - \left\{\max\left[\min\left(\alpha + (k-1)\min\{1, \alpha, \delta\},\ \alpha + 1\right),\ 1\right]\right\}^{-1},$$

with rich mathematical properties. When δ → ∞, the new solutions
converge back to the original solution for the k–α model. With δ, the
three-parameter model is characterized by four different phases. Three of
the regimes are generalizations of those found in the k–α model, where
the scaling exponent γ does not depend on δ and equals its value in the
limit of δ → ∞, that is, γ(k, α, δ) = γ(k, α, ∞). The fourth one, however, is
a new phase and only exists for small δ. The intuition is that in this regime
the inability to select a high-quality version (small δ) dominates the
scaling behaviour, with exponent γ(k, α, δ) = 1 − [(k − 1)δ + α]^{−1}.
Together, these extensions offer further support for the predictions of our
original model, while demonstrating the theoretical potential of the model
by enriching its mathematical properties for more realistic interpretations.
They also point to promising future research that explores the interplay
between different perspectives on learning.
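The closed-form exponent can be evaluated directly. The sketch below assumes the form γ = 1 − 1/max[min(α + (k − 1)·min{1, α, δ}, α + 1), 1]. The function name is an assumption, and δ → ∞ is represented by `math.inf`.

```python
import math


def scaling_exponent(k: float, alpha: float, delta: float = math.inf) -> float:
    """Scaling exponent gamma(k, alpha, delta) of the three-parameter model.

    With delta = inf this reduces to the k-alpha model:
      gamma = 0                    if k * alpha <= 1
      gamma = 1 - 1 / (k * alpha)  if k * alpha > 1 and (k - 1) * alpha <= 1
      gamma = 1 - 1 / (alpha + 1)  if (k - 1) * alpha > 1   (for alpha <= 1)
    """
    inner = alpha + (k - 1) * min(1.0, alpha, delta)
    return 1.0 - 1.0 / max(min(inner, alpha + 1.0), 1.0)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(scaling_exponent(k, 0.6), 4))
    # Small delta can pull the exponent down for otherwise identical k, alpha.
    print(round(scaling_exponent(3, 0.6, delta=0.3), 4))
    print(round(scaling_exponent(3, 0.6, delta=0.1), 4))
```

For k = 3 and α = 0.6 the exponent stops changing once δ reaches min(α, 1/(k − 1)) = 0.5, which is the saturation point reported for this generalization.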

Note that although all three variations of the model predict the exist- 
ence of different phases, the primary focus of this paper concerns the 
fundamental differences in the nature of these regimes (that is, stagna- 
tion versus progression), rather than the behaviour of the system as it 
approaches the critical threshold. As such, the conclusions of the paper 
hold the same regardless of any specific critical behaviour around the 
threshold. 


Factors related to learning rate 

Our model offers a framework to anchor potential factors relevant to
learning. As an example, here we test three different factors. First,
the literature has identified several factors for the emergence of learning
at the level of organizations, suggesting that individual learning is
just one factor in how and why organizations learn. This suggests that
settings closer to organizational learning (such as terrorist groups)
should correspondingly experience higher learning rates than those
closer to individual learning (such as NIH principal investigators)
(Supplementary Information 5.5). We test this hypothesis by calculating the
average scaling exponent γ measured from our data (Supplementary
Table 4), and find that our estimations support this hypothesis; learning
rates are lowest for individual researchers, higher for entrepreneurs
and their founding teams and highest for terrorist organizations. Note
that although these results show consistency with theories from the
organizational learning literature, these differences could also be due
to inherent domain-specific differences.

Second, higher previous achievements often bring recognition and
resources, a phenomenon referred to as the Matthew effect, which might
translate into higher learning rates. To test this, we link NIH grant
application data to the Web of Science citation database through a
systematic effort to disambiguate authors, and match the citations of
previous research papers with submitted proposals (Supplementary
Information 5.6). We take principal investigators who failed more than three
times before their eventual success and calculate the total number of
citations to each investigator’s papers, including only papers published
before the first failure. We find that prior acclaim is positively and
significantly correlated with learning rate γ (P < 0.001).

Third, persistent gender inequalities in science and entrepreneurship
suggest the possibility that failure dynamics may be mediated
by gender. Our regression analysis reveals a significant correlation
between gender and learning rate (Supplementary Information 5.7). All
else being equal, the learning rate γ of a male principal investigator in
the NIH system exceeds that of a female principal investigator by 0.14
(P = 0.001), suggesting that male principal investigators fail faster than
their female colleagues. This difference appears substantial, considering
that the average learning rate is centred around 0.35. We further
test this relationship in the startup dataset, finding a similar gap of 0.10
between male and female innovators, but this result is not significant,
possibly owing to a smaller sample size. Note that these gender
differences probably flow from institutional as well as individual causes,
such as a culture that discourages women from persistence and encourages
oversensitivity to feedback. Indeed, one irony suggested by our model
is that agents in the stagnation region did not work less. Rather, they
made more, albeit unnecessary, modifications to what were otherwise
advantageous experiences.


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this article. 


Data availability 
Extended Data Fig. 1 | The k model. a–f, Simulation results from the model
(α = 0.6) for the cases of k = 0 (a, d) and k → ∞ (b, e) in terms of the average
quality (a–c) and efficiency (d–f) of each attempt. k = 0 recovers the chance
model, predicting a constant quality (c) and efficiency (f). k → ∞ predicts
temporal scaling that characterizes the dynamics of failure (e) with improved
quality (b), recovering predictions from learning curves and Wright’s law.
g–j, Illustration of mapping between failure dynamics (g, h) and canonical
ensembles (i, j). The
canonical system is characterized by three different states a, b, c with
corresponding energy densities E_a(h), E_b(h), E_c(h). Here we assume
E_a(h) = (2εh − 1)², E_b(h) = (2h − 1)² and E_c(h) = [2ε(1 − h) − 1]², where
ε → 0⁺. The introduction of ε is to distinguish state a from state c, both of
which can be approximated in the limiting condition E_a(h) = E_c(h) = 0. We
map f → (2Γ − 1)², N → ln[n], h → K and E_i(h) = [2Γ_i(K) − 1]². In this case,
the two transition points k* and k* + 1 correspond to h = 0 and 1 in the
canonical ensemble systems.
Extended Data Fig. 2 | Predicting temporal dynamics in science,
entrepreneurship and security. a–c, We compare the goodness of fit for three
different models of temporal dynamics in NIH grants (a, n = 10,345), startups
(b, n = 275) and terrorist attacks (c, n = 136). For each individual sample, we
take all but the last inter-event time for model fitting (n = 1, ..., N − 1),
comparing model predictions for the last inter-event time. The tested
functional forms are power law, t_n = an^b; exponential, t_n = ab^n; and linear,
t_n = a + bn. We then calculate the frequency with which each model reaches
minimum error, defined as |log(t_N) − log(t̂_N)|, among all three forms. The
power-law model offers consistently better predictions. d–f, As in a–c, but
using |t_N − t̂_N| as the loss function.
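The hold-out comparison described in this caption can be sketched as follows. The caption does not state how each form is fitted, so the sketch assumes ordinary least squares (on log-transformed data for the power-law and exponential forms). All function names are illustrative.

```python
import math


def _ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def predict_last(times):
    """Fit each form on t_1..t_{N-1} and predict t_N (needs N >= 3, times > 0)."""
    N = len(times)
    ns = list(range(1, N))
    train = times[:-1]
    logs = [math.log(t) for t in train]
    c, b = _ols([math.log(n) for n in ns], logs)  # log t = log a + b log n
    power = math.exp(c) * N ** b
    c, s = _ols(ns, logs)                         # log t = log a + n log b
    expo = math.exp(c + s * N)
    a, b = _ols(ns, train)                        # t = a + b n
    return {"power_law": power, "exponential": expo, "linear": a + b * N}


def best_model(times):
    """Name of the form with the smallest |log(t_N) - log(prediction)|."""
    actual = times[-1]
    preds = predict_last(times)

    def log_error(pred):
        # A non-positive prediction has no logarithm; treat it as the worst case.
        return abs(math.log(actual) - math.log(pred)) if pred > 0 else math.inf

    return min(preds, key=lambda name: log_error(preds[name]))


if __name__ == "__main__":
    print(best_model([10.0 * n ** -0.5 for n in range(1, 11)]))
```

Counting how often each name is returned across individuals gives the frequency statistic plotted in panels a–c; swapping `log_error` for an absolute difference gives the variant in panels d–f.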
regression models (Supplementary Information 6.1) to predict ultimate success investigators.
Extended Data Fig. 4 | Model validations. a, b, An illustration of the
component dynamics. We extract all MeSH terms associated with the nth
attempt, S_n, and calculate the number of new terms m_n, defined as
|S_n − (S_{n−k} ∪ ⋯ ∪ S_{n−1})|. b, Testing component dynamics in NIH grant
applications. We calculate the dynamics of M_n = ⟨m_n⟩/⟨m_1⟩ using different k
and compare it with T_n. The centres and error bars of M_n show the mean ±
s.e.m. (n = 5,899) for different k. The shaded area shows mean ± s.e.m. of T_n
(log scale) measured on the same subset. All k > 3 lead to similar trends
between M_n and T_n. c–e, Length of failure streak after randomization in
science (c), entrepreneurship (d) and security (e). We take the samples used
in Fig. 1 and shuffle the success/failure label from each attempt. This
operation keeps both the overall success rate and the total number of attempts
for each individual constant. f–h, Temporal scaling patterns within the
successful group in science (f), entrepreneurship (g) and security (h). We
separated the successful group into two subgroups (narrow winners and clear
winners) based on eventual performance (0.9 in evaluation score for D_1, 0.5
in investment amount for D_2 and 1 in wounded individuals for D_3). The shaded
area shows mean ± s.e.m. of T_n (log scale).
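The new-term count used for the component dynamics can be sketched with plain sets. It assumes m_n counts the terms of attempt n that do not appear in any of the preceding k attempts. The function name and the toy terms are illustrative.

```python
def new_term_counts(attempts, k):
    """Number of new terms m_n for each attempt.

    `attempts` is a list of sets of terms in chronological order. For
    attempt n, terms already present in any of the previous k attempts
    are not counted. Early attempts simply have fewer than k predecessors.
    """
    counts = []
    for n, terms in enumerate(attempts):
        previous = set().union(*attempts[max(0, n - k):n])
        counts.append(len(set(terms) - previous))
    return counts


if __name__ == "__main__":
    history = [{"a", "b"}, {"b", "c"}, {"a", "d"}, {"d"}]
    print(new_term_counts(history, k=1))
    print(new_term_counts(history, k=2))
```

Averaging these counts over individuals at each attempt and dividing by the average at the first attempt gives a normalized curve comparable to M_n.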
Extended Data Fig. 5 | Robustness check on definition of unsuccessful
group. a–l, Robustness check as we change the threshold of inactivity to 3
years. a–c, Failure streak in science (a), entrepreneurship (b) and security
(c). Blue circles represent real data from the successful group and dashed
lines represent fitted Weibull distributions. d–f, Temporal scaling patterns
in science (d), entrepreneurship (e) and security (f). The shaded area shows
mean ± s.e.m. of T_n (log scale). g–i, Performance dynamics in science (g,
n = 641, 231, 578, 190, from left to right), entrepreneurship (h, n = 248,
1,332, 237, 1,312, from left to right) and security (i, n = 238, 198, 236, 199,
from left to right). The successful and unsuccessful groups that experienced
a large number of consecutive failures before the last attempt (at least 5 for
D_1, 3 for D_2 and 2 for D_3) appear indistinguishable for first failures
(two-sided Welch’s t-test; P = 0.566, 0.671 and 0.349), but quickly diverge for
second failures (two-sided Welch’s t-test;
P=2.09 x10, 4.95 x 10° and 7.77 x10). The successful group also shows
significant improvement in performance (one-sided Welch’s t-test;
P=7.03 x 107, 2.37 x10? and 2.32 x 10”), which is absent for the unsuccessful
group (one-sided Welch’s t-test; P = 0.717, 0.176 and 0.786). Data are
mean ± s.e.m. j–l, AUC score of predicting ultimate success in science (j),
entrepreneurship (k) and security (l). The centres and error bars of AUC
scores denote the mean ± s.e.m. calculated from tenfold cross-validation over
50 randomized iterations. m–x, As in a–l but using 7 years as the threshold of
inactivity. Sample sizes are s: n = 620, 101, 559, 76; t: n = 248, 977, 237, 989;
u: n = 216, 152, 214, 153. P values in s–u (from bottom to top) are P = 0.883 (s),
0.671 (t), 0.456 (u); P=2.25 x10 (s), 1.38 x 10 3 (t), 8.34 x10 (u); P=4.59 x10?
(s), 2.37 x10? (t), 3.33 x 10? (u); P = 0.838 (s), 0.446 (t), 0.775 (u). *P < 0.1,
**P < 0.05, ***P < 0.01; NS, not significant (P > 0.1).
Extended Data Fig. 6 | Robustness check on D_1. a–c, Failure streak as we
change the score threshold to 55 (a), exclude revisions as successes (b) and
only focus on new principal investigators without previous R01 grants (c).
Blue circles represent real data from successful groups and dashed lines
represent fitted Weibull distributions. d–f, Temporal scaling patterns as we
change the score threshold to 55 (d), exclude revisions as successes (e) and
only focus on new principal investigators without previous R01 grants (f). The
shaded area shows mean ± s.e.m. of T_n (log scale). g–i, Performance dynamics
as we change the score threshold to 55 (g, n = 768, 189, 686, 170, from left to
right), exclude revisions as successes (h, n = 252, 145, 216, 123, from left to
right) and only focus on new principal investigators without previous R01
grants (i, n = 1,164, 308, 1,530, 334, from left to right). The successful and
unsuccessful groups that experienced a large number of consecutive failures
before their last attempt (at least 5 for g and h, and 3 for i) appear
indistinguishable for first failures (two-sided Welch’s t-test;
P = 0.242, 0.819, 0.289) but quickly diverge for second failures (two-sided
Welch’s t-test; P=3.40 x10*, 3.40 x 107, 9.70 x 10°’). The successful group also
shows a significant improvement in performance (one-sided Welch’s t-test;
P=4,23x107, 3.04 x10, 1.92 x10“), which is absent for the unsuccessful group
(one-sided Welch’s t-test; P = 0.863, 0.754, 0.997). Data are mean ± s.e.m.
j–l, AUC score of predicting ultimate success as we change the score threshold
to 55 (j), exclude revisions as successes (k) and only focus on new principal
investigators without previous R01 grants (l). The centres and error bars of
AUC scores denote the mean ± s.e.m. calculated from tenfold cross-validation
over 50 randomized iterations. *P < 0.1, **P < 0.05, ***P < 0.01; NS, P > 0.1.
Extended Data Fig. 7 | Robustness check on D_2. a–c, Failure streak as we
change the threshold of high-value mergers and acquisitions (M&A) to 5% (a),
exclude M&As as successes (b) and classify unicorns as successes (c). Blue
circles represent real data from successful groups and dashed lines represent
fitted Weibull distributions. d–f, Temporal scaling patterns as we change the
threshold of high-value M&A to 5% (d), exclude M&As as successes (e) and
include unicorns as successes (f). The shaded area shows mean ± s.e.m. of T_n
(log scale). g–i, Performance dynamics as we change the threshold of
high-value M&A to 5% (g, n = 251, 1,304, 243, 1,284, from left to right),
exclude M&As as successes (h, n = 248, 1,335, 237, 1,315, from left to right)
and include unicorns as successes (i, n = 257, 1,330, 244, 1,311, from left to
right). The successful and unsuccessful groups that experienced a large number
of consecutive failures before their last attempt (at least 3) appear
indistinguishable for first failures (two-sided Welch’s
t-test; P = 0.937, 0.647, 0.620) but quickly diverge for second failures
(two-sided Welch’s t-test; P=9.92 x 10°, 4.94 x 10°, 6.33 x 10°). The
successful group also shows a significant improvement in performance
(one-sided Welch’s t-test;
P=2.16 x 107, 2.37 x10, 2.77 x10”), which is absent for the unsuccessful group
(one-sided Welch’s t-test; P = 0.224, 0.158, 0.167). Data are mean ± s.e.m.
j–l, AUC score for predicting ultimate success as we change the threshold of
high-value M&A to 5% (j), exclude M&As as successes (k) and include unicorns
as successes (l). The centres and error bars of AUC scores denote the mean ±
s.e.m. calculated from tenfold cross-validation over 50 randomized iterations.
*P < 0.1, **P < 0.05, ***P < 0.01; NS, P ≥ 0.1.
Extended Data Fig. 8 | Robustness check on D_3. a–c, Failure streak as we
focus on all samples (a), samples of human-targeted attacks (b) and include
vague data on fatalities (c). Blue circles represent real data from successful
groups and dashed lines represent fitted Weibull distributions. d–f, Temporal
scaling patterns as we focus on all samples (d), samples of human-targeted
attacks (e) and include vague data on fatalities (f). The shaded area shows
mean ± s.e.m. of T_n (log scale). g–i, Performance dynamics as we focus on all
samples (g, n = 231, 231, 229, 232, from left to right), samples of
human-targeted attacks (h, n = 176, 173, 173, 174, from left to right) and
include vague data on fatalities (i, n = 227, 147, 225, 148, from left to
right). The successful and unsuccessful groups that experienced a large number
of consecutive failures before their last attempt (at least 2) appear
indistinguishable for first failures (two-sided Welch’s t-test;
P = 0.400, 0.859, 0.395), but quickly diverge for second failures (two-sided
Welch’s t-test; P=2.08 x 10°, 6.70 x 10°, 3.76 x 10°). The successful group also
shows a significant improvement in performance (one-sided Welch’s t-test;
P=2.55 107, 5.65107, 3.77 x10), which is absent for the unsuccessful group
(one-sided Welch’s t-test; P = 0.970, 0.901, 0.967). Data are mean ± s.e.m.
j–l, AUC score of predicting ultimate success as we focus on all samples (j),
samples of human-targeted attacks (k) and include vague data on fatalities
(l). The centres and error bars of AUC scores denote the mean ± s.e.m.
calculated from tenfold cross-validation over 50 randomized iterations.
m–o, Temporal scaling patterns as we change the threshold for the successful
group to fatal attacks that killed at least 5 (m), 10 (n) and 100 (o) people.
*P < 0.1, **P < 0.05, ***P < 0.01; NS, P > 0.1.
Extended Data Fig. 9 | Additional robustness checks. a–i, Robustness check
as we control for temporal variation. a–c, Failure streak in science (a),
entrepreneurship (b) and security (c). Blue circles represent real data of
successful groups and dashed lines represent fitted Weibull distributions.
d–f, Temporal scaling patterns in science (d), entrepreneurship (e) and
security (f). The shaded area shows mean ± s.e.m. of T_n (log scale).
g–i, Performance dynamics in science (g, n = 628, 145, 571, 123, from left to
right), entrepreneurship (h, n = 248, 1,332, 237, 1,312, from left to right) and
security (i, n = 231, 173, 229, 174, from left to right). The successful and
unsuccessful groups that experienced a large number of consecutive failures
before their last attempt (at least 5 for D_1, 3 for D_2 and 2 for D_3) appear
indistinguishable for first failures (two-sided weighted Welch’s t-test;
P = 0.814, 0.728, 0.330) but quickly diverge for second failures (two-sided
weighted Welch’s t-test; P=1.80 x10”,
3.10 x10, 4.56 x 107). The successful group also shows significant
improvement in performance (one-sided weighted Welch’s t-test; P=2.10 x10”,
1.92 x 107, 4.53 x10’), which is absent for the unsuccessful group (one-sided
weighted Welch’s t-test; P = 0.755, 0.175, 0.903). Data are mean ± s.e.m.
j–l, Performance dynamics as we compare first and halfway attempts in science
(j, n = 628, 145, 582, 111, from left to right), entrepreneurship (k, n = 248,
1,332, 240, 1,294, from left to right) and security (l, n = 231, 173, 228, 175,
from left to right). The successful and unsuccessful groups that experienced a
large number of consecutive failures before their last attempt (at least 5 for
D_1, 3 for D_2 and 2 for D_3) appear indistinguishable for first failures
(two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for halfway
failures (two-sided Welch’s t-test;
P=2.18x10%,1.34x107, 1.34 x 107). The successful group also shows significant
improvement in performance (one-sided Welch’s t-test; P=2.35 x10”,
4.54107, 3.69 x10), which is absent for the unsuccessful group (one-sided
Welch’s t-test; P = 0.992, 0.252, 0.955). Data are mean ± s.e.m.
m–o, Performance dynamics as we compare the first and penultimate attempts in
science (m, n = 628, 145, 896, 87, from left to right), entrepreneurship (n,
n = 248, 1,332, 227, 1,199, from left to right) and security (o, n = 231, 173,
230, 173, from left to right). The successful and unsuccessful groups that
experienced a large number of consecutive failures before the last attempt (at
least 5 for D_1, 3 for D_2 and 2 for D_3) appear indistinguishable for first
failures (two-sided Welch’s t-test; P = 0.898, 0.671, 0.289) but diverge for
penultimate failures (two-sided Welch’s t-test;
P=8.50x10°,3.12 x10, 1.13 x 10). The successful group also shows a
significant improvement in performance (one-sided Welch’s t-test;
P=5.79x10°, 4.30 x 10 7, 1.33 x 10”), which is absent for the unsuccessful
group (one-sided Welch’s t-test; P = 0.980, 0.138, 0.923). Data are mean ±
s.e.m. p–r, The correlation between length of failure streak and initial
performance (samples with repeated failures) in science (p, n = 12,171),
entrepreneurship (q, n = 2,086) and security (r, n = 441). Correlation is weak
across all three datasets (Pearson correlation; r = −0.051, −0.011, −0.107 for
p, q, r, respectively). s–u, Length of failure streak still follows fat-tailed
distributions conditional on bottom 10% initial performance samples in science
(s, n = 6,339), entrepreneurship (t, n = 2,438) and security (u, n = 1,092).
Two-sided Kolmogorov–Smirnov test between sample and exponential distributions
rejects the hypothesis that the two distributions are identical with P < 0.01.
*P < 0.1, **P < 0.05, ***P < 0.01; NS, P > 0.1.
Extended Data Fig. 10 | Generalization of the k model. a, The α parameter
connects the potential to improve (1 − x) with the likelihood of creating new
versions p through p = (1 − x)^α. b, Phase diagram of the k–α model. The
two-dimensional parameter space is separated into three regimes, with
boundaries at kα = 1 and (k − 1)α = 1. c, The impact of the δ parameter on
scaling exponent γ for given k = 1, 2, 3 and α = 0.4, 0.8, 1.2. We find that δ
may affect the temporal scaling parameter when it is small, but has no further
effect beyond a certain point δ* = min(α, 1/(k − 1)). d, Phase diagram of the
k–α–δ model for k = 3, with boundaries at α = δ, (k − 1)δ = 1, (k − 1)δ + α = 1,
kα = 1 and (k − 1)α = 1, respectively.
Research sample: We collected three large-scale datasets from three domains:
(1) R01 grant applications ever submitted to the National Institutes of
Health (NIH) (776,721 applications by 139,091 investigators from 1985 to
2015); (2) start-up investment records from the VentureXpert database (58,111
startup companies involving 253,579 innovators); and (3) terrorist attack data
from the Global Terrorism Database (70,350 terrorist attacks by 3,178
terrorist organizations from 1970 to 2017).

Sampling strategy: No statistical methods were used to predetermine sample size.

Data collection: This study is based on pre-existing datasets.

Timing: The NIH dataset was collected in 2016. The VentureXpert and GTD datasets were collected in 2017.

Data exclusions: The analysis has no data exclusions. Selection criteria within a dataset are described in the Supplementary Information.

Non-participation: There are no participants in this study.

Randomization: This is a data-driven study, not a randomized experiment.
The mammalian cortex is a laminar structure containing many areas and cell
types that are densely interconnected in complex ways, and for which
generalizable principles of organization remain mostly unknown. Here we
describe a major expansion of the Allen Mouse Brain Connectivity Atlas
resource, involving around a thousand new tracer experiments in the cortex and
its main satellite structure, the thalamus. We used Cre driver lines (mice
expressing Cre recombinase) to comprehensively and selectively label
brain-wide connections by layer and class of projection neuron. Through
observations of axon termination patterns, we have derived a set of
generalized anatomical rules to describe corticocortical, thalamocortical and
corticothalamic projections. We have built a model to assign connection
patterns between areas as either feedforward or feedback, and generated
testable predictions of hierarchical positions for individual cortical and
thalamic areas and for cortical network modules. Our results show that
cell-class-specific connections are organized in a shallow hierarchy within
the mouse corticothalamic network.


Cognitive processes and voluntary control of behaviour originate
in the cortex. Understanding how incoming sensory information is
processed and integrated with past experiences and current states
in order to generate appropriate behaviour requires knowledge of
the anatomical patterns and rules of connectivity between cortical
areas. Connectomes (complete descriptions of brain wiring) exist at
different levels of spatial granularity, from single cells to populations
of cells and entire areas (micro-, meso- and macro-scale). Common
organizational features of macro- and meso-scale cortical connectivity
have been distilled across data sets, often using graph theory
approaches to describe network architecture. For example, cortical
areas have unique patterns of connections (‘fingerprints’), connection
strengths follow a log-normal distribution spanning more than four
orders of magnitude, and the organization of cortical areas is modular,
with distinct modules corresponding to specific functions.

The concept of hierarchical organization is important for understanding
the cortex, and has inspired the development of neural network
methods in deep machine learning. A hierarchy of cortical areas was
first derived by mapping anatomical patterns of corticocortical (CC)
connections onto feedforward and feedback directions. In primate,
feedforward connections were characterized by dense axon terminations
in layer (L)4 of the target area, and feedback connections as dense
terminals in superficial and deep layers (avoiding L4). Differences in
the layers of origin are also associated with feedforward and feedback
connections. It is still unclear whether the concept of a cortical
hierarchy, which has been derived largely from sensory systems, can be
applied globally across the entire cortex, and how it arises from connections
made by different classes of neurons. Each cortical region comprises
distinct types of excitatory neurons that are largely organized by layers,
but also by long-distance projection patterns: intratelencephalic (IT)
in L2–L6, pyramidal tract (PT) in L5, and corticothalamic (CT) in L6.

Thalamic nuclei make important contributions to cortical function.
They serve as a ‘relay’ for primary sensory information, and are
well positioned to influence cortical information processing through
reciprocal or transthalamic loops. Thalamocortical (TC) projection
neurons are classified into three major classes: core, intralaminar,
and matrix. Like CC projections, feedforward and feedback rules have
been proposed for TC and CT projections. Core projections (to L4)
are described as ‘driver’ (feedforward) and matrix projections (to L1)
as ‘modulator’ (feedback). For CT connections, input from L6 is
considered feedback, and from L5 feedforward.

We hypothesize that a unifying hierarchical organization across the
entire cortex and its major input structure, the thalamus, is governed
by a set of anatomical rules for CC, CT and TC connections. By using
diverse Cre driver mouse lines to selectively label cells from different
cortical layers and classes, we have substantially expanded the
Allen Mouse Brain Connectivity Atlas resource
(http://connectivity.brain-map.org), adding 1,256 new tracing experiments.
Our findings follow analyses of projection patterns spanning nearly the
entire mouse cortex and thalamus, and show how these patterns relate to
layer and cell class. We test the above hypothesis by building a
computational hierarchical model using anatomical rules derived from
observations of axon termination patterns. Our results show that the mouse
cortex and thalamus form an integrated hierarchical organization.

Fig. 1 | Cortical tracer experiments and network modularity. a, Top-down
view of the right cortical hemisphere in CCFv3. b, A virtual cortical flat map
shows all 43 annotated areas. The white dotted line indicates the boundaries of
what is visible in a. c, Cortical injection locations plotted on the flat map.
d, Key summarizes layer and projection class selectivity for 15 mouse lines.
The colour code is also used in c; experiments in lines not listed are
coloured dark grey. e, Matrix shows ipsilateral normalized connection
densities between 43 cortical areas. Top left corner: the modularity metric
(Q) and Q_shuffled are plotted for each γ level. Colours to the left of each
row indicate community structure at


Cre drivers for cortical projection mapping 

Our goal for expanding the Allen Mouse Brain Connectivity Atlas was
to create a map of all interareal projections that originate from neurons
of different cell classes within a given source. Here, we used 50 mouse
lines (wild-type C57BL/6J mice and 49 Cre driver lines) for cortical
projection mapping. We injected Cre-dependent enhanced green fluorescent
protein (EGFP) or synaptophysin-EGFP viral tracers to selectively trace
axons from Cre⁺ neurons (see Extended Data Fig. 1a–c for virus
comparison). Using our high-throughput imaging and informatics pipeline,
we produced 1,081 cortical tracer experiments suitable for
analyses (see Methods and Supplementary Tables 1, 2). To visualize
coverage, we plotted injection locations for all experiments on a cortical
surface flat map of the 3D Allen Common Coordinate Framework reference
atlas (CCFv3, Fig. 1a–c). High-resolution image series, visualization
tools, and quantification of injection sites and brain-wide targets are
accessible through our data portal (http://connectivity.brain-map.org).
We inspected brain-wide axonal projection patterns to manually
classify each experiment into one of six layer and projection classes: (1)
IT PT CT: labelled axons originate from all source layers and terminate
in all target regions (ipsilateral and contralateral cortex and striatum,
thalamus, and midbrain, pons and/or medulla); (2) IT PT: labelled axons
observed in all target regions, but the injection site did not include L6
neurons; (3) IT: labelled axons restricted to ipsilateral and contralateral
cortex and striatum; (4) PT: labelled axons project ipsilaterally
and subcortically; (5) CT: labelled axons project almost exclusively to
thalamus from L6; and (6) local: no (or few) long-distance axons present
(Extended Data Fig. 2a, Supplementary Table 1). Manual assignment
to projection class was consistent with the results of unsupervised
clustering (Extended Data Fig. 2b–d) and previous characterizations.
We also characterized layer selectivity in the source for each Cre line
on the basis of injection sites (Extended Data Fig. 2e).

From these data, we chose a core set of 15 lines that we used to
comprehensively map connectivity from different projection neuron classes
across cortical layers (Fig. 1d), resulting in 849 experiments used for
subsequent analyses of CC and CT projections. We did not identify a
suitable Cre line for L6 IT.


Corticocortical connectivity modules 

Previous network analyses revealed a modular community structure in
the mouse brain, including in the isocortex’. To determine whether our
data set demonstrated a similar network architecture, we constructed
an ipsilateral cortical connectivity matrix (Fig. 1e) using a data-driven
model based on wild-type mice”. We analysed the network structure
of this matrix using the Louvain algorithm”, which maximizes a modularity
metric (Q) to identify groups of nodes (cortical areas) that are
most densely connected to each other compared to a randomized
network. To identify stable modules, we systematically varied the spatial
resolution parameter, γ, from 0 to 2.5, measured Q at each value,
and compared the results to a shuffled network. The mouse cortex
was modular (Q > Q_shuffled) for every value of γ above 0.3. We chose to
focus on the six modules identified at γ = 1.3 (Q = 0.36), where the difference
between Q and Q_shuffled was maximal (0.22 ± 0.017, mean ± s.d.).
We named these six modules for the areas assigned to each: prefrontal,
lateral, somatomotor, visual, medial, and auditory. Even with substantial
community structure, intracortical connections are dense between
modules (Fig. 1f). The Louvain algorithm parameterizes edge strength
only, with no constraint for spatial arrangement of nodes, but there is a
clear spatial component, in that neighbouring areas usually belong to
the same module (Fig. 1g). We directly tested the degree to which spatial
proximity affects modularity by fitting a power-law to the distance
component of the ipsilateral connectivity matrix, and then analysing
the resulting residual matrix using the Louvain algorithm (Extended
Data Fig. 3). Although fewer modules were present after accounting
for distance, regions within them were generally still anatomically
adjacent.
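
As an illustration of the quantity that the Louvain algorithm maximizes, the sketch below computes the modularity Q of a given partition at a chosen resolution γ. It is a minimal example and not the analysis code used here: it assumes an undirected, symmetric toy matrix (the cortical matrix is directed), it scores a fixed partition rather than searching for one, and the names `modularity`, `W`, `two_modules` and `one_module` are hypothetical.

```python
from itertools import product


def modularity(weights, labels, gamma=1.0):
    """Newman modularity Q of a partition of an undirected weighted graph.

    weights: symmetric square matrix (list of lists) of edge weights.
    labels:  labels[i] is the module assigned to node i.
    gamma:   resolution parameter; larger values favour smaller modules.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]   # node strengths k_i
    two_m = sum(strength)                      # total weight, each edge counted twice
    q = 0.0
    for i, j in product(range(n), repeat=2):
        if labels[i] == labels[j]:
            expected = gamma * strength[i] * strength[j] / two_m
            q += weights[i][j] - expected
    return q / two_m


# Toy network: two triangles (nodes 0-2 and 3-5) joined by one weak edge.
W = [[0.0] * 6 for _ in range(6)]
for a, b, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.2)]:
    W[a][b] = W[b][a] = float(w)

two_modules = [0, 0, 0, 1, 1, 1]   # one module per triangle
one_module = [0] * 6               # everything in a single module

for gamma in (0.5, 1.0, 1.3):
    print(gamma, round(modularity(W, two_modules, gamma), 3))
```

Comparing Q for the observed matrix with Q for shuffled copies of it, at each γ, gives the kind of Q minus Q_shuffled curve described above.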


Corticocortical projections by layer or class 


To investigate the contributions of distinct cell classes within each area
to CC projections, we compiled 43 groups of spatially matched experiments,
each having a ‘complete’ membership roster representing all
layer classes (L2/3 IT, L4 IT, L5 IT PT, L5 IT, L5 PT and L6 CT) plus a wild-type
or Emx1 IT PT CT data set (note that Cre mouse lines are referred
to by gene symbol; for example, Emx1 is Emx1-IRES-Cre). Projection
class was confirmed for each experiment. These 43 anchor groups,
composed of 364 experiments, represent 25 out of 43 CCFv3 cortical
areas (Fig. 2a-d, Supplementary Table 4). From any given source, CC
projections labelled from these Cre lines had similar overall patterns,
but Rbp4 (L5 IT PT) consistently appeared to show the most extensive
projections (Fig. 2e). Intracortical projections were labelled from all
layers (L2/3-L6). The identification of interareal projections from L4
was unexpected, given canonical circuit descriptions, but is not without
recent precedent”®". To confirm that IT projections could truly be
attributed to L4 neurons, we reconstructed the complete dendritic and
axonal morphology of 25 sparsely labelled neurons following whole-brain
fluorescence micro-optical sectioning tomography (fMOST)
imaging”. We identified three classes of L4 neuron using morphological
criteria®’, and confirmed that many, but not all, individual L4 cells sent
axons to other cortical areas (Extended Data Fig. 4).

To make quantitative comparisons across Cre lines, we first manually
identified true positive and negative connections for each experiment
in the anchor groups (43 ipsilateral and 43 contralateral targets in 364
experiments, giving 31,304 connections checked; Supplementary
Table 4). We noted when a target contained only fibres of passage,
and considered it a true negative. Using automated segmentation and
registration to CCFv3, we generated a weighted connectivity matrix
(using normalized projection volume (NPV); see Methods) for each Cre
line (wild-type and Emx1 were merged; Extended Data Fig. 1e-g), and
applied the true positive mask to remove true negative connections
(Fig. 2f). We selected only one anchor group per cortical region for
visualization if there was a significant, positive correlation between
Rbp4 replicates (Spearman r > 0.8), resulting in 27 groups: 25 unique
areas and two locations in the secondary motor area and supplemental
somatosensory area.
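
The masking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline code: the NPV values and manual calls are invented toy data, masked entries are represented as `None`, and the function name `masked_log_npv` is hypothetical. The count of manually checked connections follows directly from the numbers given above.

```python
import math


def masked_log_npv(npv, true_positive):
    """Return log10(NPV) where the manual call is positive, otherwise None."""
    out = []
    for npv_row, mask_row in zip(npv, true_positive):
        row = []
        for value, is_positive in zip(npv_row, mask_row):
            if is_positive and value > 0:
                row.append(math.log10(value))
            else:
                row.append(None)   # true negative, including fibres of passage
        out.append(row)
    return out


# Connections checked by hand in the anchor groups.
n_targets = 43 + 43                       # ipsilateral + contralateral cortical targets
n_experiments = 364
n_checked = n_targets * n_experiments     # 31,304, as stated in the text

# Toy data: 2 experiments (rows) x 3 targets (columns).
npv = [[0.1, 0.001, 0.02],
       [0.5, 0.0, 0.003]]
mask = [[True, False, True],
        [True, False, False]]
masked = masked_log_npv(npv, mask)
```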

Overall, the CC matrices revealed several features of layer- or
class-specific connectivity in terms of the number and specificity of
connections. The average ‘out-degree’ (number of targets; Fig. 2g)
from Rbp4 was larger in both hemispheres than from any other Cre line,
except for Tlx3 on the contralateral side. L5 PT and L6 CT lines had the
fewest targets in both hemispheres, followed by the L2/3, L4, and L5 IT
lines. For every line, there were fewer (or no) contralateral compared
to ipsilateral connections.
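
A minimal sketch of the out-degree calculation, assuming each experiment is represented by a row of true positive calls with ipsilateral targets listed first. The masks below are toy data (four targets per hemisphere rather than 43), and `out_degree`, `masks_by_line` and `avg` are hypothetical names.

```python
def out_degree(mask_row, n_ipsi):
    """Count positive targets in the ipsilateral and contralateral halves of a row."""
    ipsi = sum(1 for hit in mask_row[:n_ipsi] if hit)
    contra = sum(1 for hit in mask_row[n_ipsi:] if hit)
    return ipsi, contra


def mean(values):
    return sum(values) / len(values)


# Toy masks: 4 ipsilateral targets followed by 4 contralateral targets.
masks_by_line = {
    "Rbp4": [[1, 1, 1, 0, 1, 1, 0, 0],
             [1, 1, 1, 1, 1, 0, 0, 0]],
    "Ntsr1": [[1, 0, 0, 0, 0, 0, 0, 0],
              [1, 1, 0, 0, 0, 0, 0, 0]],
}

avg = {}
for line, rows in masks_by_line.items():
    degrees = [out_degree(row, n_ipsi=4) for row in rows]
    avg[line] = (mean([d[0] for d in degrees]),   # average ipsilateral out-degree
                 mean([d[1] for d in degrees]))   # average contralateral out-degree
```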

We measured the amount of overlap between the specific set of 
cortical targets contacted in each experiment and its Rbp4 anchor 
(Fig. 2h). Wild-type/Emx1 projections went to about 80% of the same 
targets as Rbp4 axons. A roughly equal number of targets was unique 
to either wild-type or Emx1 (12.7 and 7%, respectively), perhaps owing 
to injection variability or differences in viral tracers. For every other 
Cre line, essentially all projections went to a subset of L5 Rbp4 targets
(Fig. 2h, fewer than 5% of targets unique to any line). Within L5, IT cells
had the most overlap with Rbp4 targets, whereas PT cells had the least
(Fig. 2e, f). Most of the differences between L2/3 and L5 resulted from
the presence of fewer projections to the contralateral hemisphere 
from L2/3 (Fig. 2g). 
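
The overlap measure can be illustrated with set operations. The sketch assumes that the three fractions (shared, unique to Rbp4, unique to the other line) are taken relative to the union of targets, which is consistent with the reported values summing to roughly 100% (about 80%, 12.7% and 7%) but is not stated explicitly here. Target names are toy examples and `overlap_fractions` is a hypothetical name.

```python
def overlap_fractions(anchor_targets, line_targets):
    """Fractions of the combined target set that are shared or unique.

    Returns (shared, anchor_only, line_only), each relative to the union.
    """
    anchor, line = set(anchor_targets), set(line_targets)
    union = anchor | line
    if not union:
        return 0.0, 0.0, 0.0
    shared = len(anchor & line) / len(union)
    anchor_only = len(anchor - line) / len(union)
    line_only = len(line - anchor) / len(union)
    return shared, anchor_only, line_only


# Toy example: positive targets for an Rbp4 anchor and one matched experiment.
rbp4 = ["MOp", "MOs", "SSp", "VISp", "ACA"]
other = ["MOp", "MOs", "SSp", "RSP"]
shared, rbp4_only, other_only = overlap_fractions(rbp4, other)
```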


Thalamocortical projections by source or class 


To investigate TC connections, we selected 81 out of 254 injection 
experiments in wild-type and Cre driver lines on the basis of the extent
of anatomical restriction to a single region. These experiments cov- 
ered 29 out of 44 thalamic nuclei in CCFv3 (Supplementary Tables 1, 
2, Extended Data Fig. 5a). Most thalamic nuclei that are known to 
contain cortically projecting neurons were included, except for the 
posterior triangular thalamic nucleus (PoT), suprageniculate nucleus 
(SGN), anterodorsal nucleus (AD), and interanteromedial nucleus of 
the thalamus (IAM)”°. 

We visually inspected the brain-wide axonal projection patterns and 
classified these 81 experiments using previous definitions for core, 
matrix and intralaminar TC projection classes'*”. Each experiment 
was manually assigned to one of four groups, or ‘none’ if no TC axons 
were observed (Fig. 3a): (1) core: labelled axons were observed in a
small number of cortical targets with axons predominantly ramifying
in L4; (2) intralaminar: labelled axons were predominantly observed
in the striatum, with weak or diffuse cortical axons present; (3) matrix
(focal): labelled axons targeted L1 in a small number of nearby targets;
and (4) matrix (multiareal): labelled axons targeted L1 in a more dis- 
tributed set of targets. Most thalamic nuclei could be assigned to one 
class, although this does not preclude regions having mixed classes 
(Fig. 3b, Supplementary Table 1). Only three regions (all primary sen- 
sory thalamic nuclei) were assigned core-type projections—the ventral 
posterolateral nucleus (VPL), ventral posteromedial nucleus (VPM), and 
dorsal part of the lateral geniculate complex (LGd)!°”". Most thalamic 
sources were matrix or intralaminar. 

Unlike the cortex, which is organized into distinct projection classes
within layers of a single region, thalamic nuclei contain relatively 
homogenous populations of cortically projecting neurons™. As we 
used multiple Cre lines and wild-type mice for thalamic injections 
(Supplementary Table 1), we generated a TC connectivity matrix to 
compare the patterns of individual experiments (Fig. 3c). We manually
identified true positive and true negative (including fibres of passage) 
connections for cortical targets (43 ipsilateral and 43 contralateral 
targets in 81 experiments, giving 6,966 connections manually checked; 
Supplementary Table 5), and performed hierarchical clustering on the
masked weights (Fig. 3c). Most sources with multiple injections clus- 
tered together, even those from different lines. Exceptions included 
the mediodorsal nucleus (MD), where further analyses showed that 
precise location mattered more than Cre line (MD-1 experiments are 
in mid-to-caudal MD, MD-2 experiments are in the rostral portion). The
specific patterns of cortical areas targeted by each cluster of thalamic 
nuclei were remarkably like the cortical modules defined by CC con- 
nections (Extended Data Fig. 5b). 
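
The clustering of experiments by their cortical projection patterns can be sketched in a few lines. The example below uses Spearman correlation as the similarity and average linkage to merge clusters, as named in the Fig. 3 legend, but it is an illustrative stand-in written in plain Python: it assumes complete rows with no masked entries and no constant rows, the log NPV values are toy data, and all function names are hypothetical.

```python
def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def average_linkage(rows, n_clusters):
    """Agglomerative clustering with distance 1 - Spearman r and average linkage."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[p] for b in clusters[q]]
                d = sum(1.0 - spearman(rows[a], rows[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge the closest pair
        del clusters[q]
    return clusters


# Toy log10 NPV rows: experiments 0 and 1 favour the first two targets,
# experiments 2 and 3 favour the last two.
rows = [[-1.0, -1.5, -3.0, -3.2],
        [-0.8, -1.2, -2.9, -3.5],
        [-3.1, -3.0, -1.0, -0.7],
        [-3.3, -2.8, -1.2, -0.9]]
clusters = average_linkage(rows, n_clusters=2)
```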




[Fig. 2 graphic. Recoverable panel labels: Cux2-IRES-Cre (L2/3 IT); Scnn1a-Tg3-Cre and Nr5a1-Cre (L4 IT); Tlx3-Cre_PL56 (L5 IT); Rbp4-Cre_KL100 (L5 IT PT); A93-Tg1-Cre (L5 PT); Ntsr1-Cre_GN220 (L6 CT). Legend: shared, Rbp4-Cre only, other line only. Axes: average out-degree (CC) by hemisphere (ipsi, contra); fraction of positive targets by mouse line. Colour scale: log10 normalized projection volume.]

Fig. 2 | Corticocortical projection patterns by layer and class. a, Forty-three
groups of experiments, spatially matched to one Rbp4 anchor (green dots).
Most group members were less than 500 µm from the anchor (median, 296 µm).
Green circles indicate the variance in distance to Rbp4 for each group.
b-e, Data from three groups. b-d, Serial two-photon tomography (STPT)
images at the centre of each injection site per Cre line were manually overlaid
by finding the best match between the pial surface (top) and white matter
boundary (bottom), then pseudocoloured by line. Scale bar, 250 µm. e, Top-down
views of CC projections for spatially matched experiments. f, Directed,
weighted connectivity matrices (27 × 86) for seven mouse lines: wild-type and
the six Cre lines in e. Each row contains the log10-transformed NPVs from a
single experiment in one of 27 source areas. Columns show cortical target
regions. Rows and columns follow the same order in each matrix. White boxes
highlight regions in the same module. True negatives and passing fibres were
masked out (dark grey). Rows for which an experiment was missing (often
because of low Cre expression) are light grey. The colour map ranges from
10^-3.5 to 10^0.5 NPV. It is truncated at both ends. g, Mean out-degrees (± s.e.m.)
across all sources for each Cre line plotted for ipsilateral and contralateral
cortex. h, The fraction of true positive targets shared by each line with its Rbp4
anchor is shown in the box plot (grey). The fraction of positive targets unique to
Rbp4 (green) or to the line indicated (white) are also shown. Box plots show
median and interquartile range (IQR). Whiskers show minimum and maximum
values.


Corticothalamic projections by layer or class


We also used the Rbp4 anchored cortical experiments to generate
weighted CT connectivity matrices. We again identified true positive
and true negative connections, this time for all thalamic targets (44 ipsilateral
and 44 contralateral targets in 256 experiments, giving 22,528 CT
connections manually checked; Supplementary Table 6). As expected
from previous publications, most cortical projections to the thalamus
were observed in L5 PT and L6 CT Cre lines (and wild-type mice), with
minimal to no true positive connections from L2/3, L4, and L5 IT lines
(Supplementary Table 6). We focused our subsequent analyses on Rbp4
to represent L5 PT, given its more comprehensive coverage. Connection
strengths were significantly correlated between the L6 CT lines
Ntsr1 and Syt6 (P < 0.0001, Fig. 4d), so we averaged or merged these
data (Fig. 4b).

The L5 and L6 CT matrices appear similar, but have quantitative differences
(Fig. 4a, b). Many thalamic targets receive inputs from both
layers (Fig. 4c), and the connection weights for shared targets are significantly
correlated (P < 0.0001, Fig. 4f). However, this coefficient (0.65)
is smaller than that between replicate experiments (0.84, Fig. 4e) and
between the L6 lines (0.77, Fig. 4d). We calculated and visualized relative
differences in input strength from L5 and L6 for every source-target
pair in the anchor group matrix (Fig. 4g). Some targets are contacted
more, or less, by L5 or L6 depending on source region, but other targets
have stronger L5 or L6 input regardless of source (bands of a single
colour down a column in Fig. 4g).
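
The relative difference between L5 and L6 input for one source-target pair can be written as a short function. This is a minimal sketch that assumes the index is computed on NPV in linear units (the text does not specify whether linear or log-transformed values were used); the input values are toy numbers and `relative_difference` is a hypothetical name.

```python
def relative_difference(l5, l6):
    """(L5 - L6) / (L5 + L6): +1 means L5 input only, -1 means L6 input only."""
    total = l5 + l6
    if total == 0:
        return None   # no input from either layer; index undefined
    return (l5 - l6) / total


# Toy NPVs for three source-target pairs.
stronger_l5 = relative_difference(0.03, 0.01)   # positive: L5 dominates
l6_only = relative_difference(0.0, 0.02)        # -1: L6 input only
no_input = relative_difference(0.0, 0.0)        # undefined
```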

Fig. 3 | Thalamocortical projection patterns by region and class. a, Left, flat
map views show TC projections labelled from the region indicated. Right, STPT
images from the centre of a cortical target (asterisk on left) show example axon
lamination patterns associated with each projection class. b, Key summarizes
projection class assigned for 29 thalamic nuclei. c, The TC connectivity matrix
(70 × 43) for individual viral tracer injection experiments with verified cortical
projections. Each row shows log10-transformed NPVs from one experiment to
the 43 ipsilateral cortical targets (columns). Cre line names for each row are in
Supplementary Table 5. Unsupervised hierarchical clustering, using
Spearman correlation and average linkages, revealed seven clusters containing
thalamic regions with cortical projection patterns resembling the cortical
modules. Matrix colour map as in Fig. 2.

Notably, some CT projections clearly travel through thalamic regions
before reaching their final targets, but also form synapses in those
areas. Although entire regions containing only passing fibres were
masked out, the remaining connections can contain a mix of fibres
and terminals. To determine the effect of this kind of axonal trajectory
on quantification of CT connection strengths, we compared a
subset of spatially matched data sets in L5 and L6 Cre lines using
synaptophysin-EGFP to preferentially label terminals over fibres
(Extended Data Fig. 6; see Methods). We found a strong linear relationship
between measured connection strengths, and between relative
differences between L5 and L6, with these two tracers, showing that
EGFP tracer results can be used confidently for quantitative estimates
of CT strengths. Nevertheless, this is an important consideration, as
across the entire brain connection strengths from synaptophysin-EGFP
experiments are on average lower than when using EGFP (Extended
Data Fig. 1c), specifically by about 0.5 log units for Rbp4 CT targets
(Extended Data Fig. 6k).


[Fig. 4 graphic. Recoverable labels: colour scale, log10 normalized projection volume; fraction of positive targets by mouse line (legend includes shared and Ntsr1 only); scatter-plot axes include Ntsr1-Cre_GN220, replicate 1 versus log10 NPV replicate 2, and log10 NPV averaged L6 line.]

Fig. 4 | Corticothalamic projections from layers 5 and 6. a, b, CT connectivity
matrices (27 × 44) for L5 (a, Rbp4) and L6 (b, average of Ntsr1 and Syt6). Each row
shows log10-transformed NPVs from one of the 27 cortical source areas in Fig. 2
to the 44 ipsilateral thalamic target regions (columns). c, The fraction of true
positive CT targets shared by wild-type (black circle) and each L6 line (yellow)
with its Rbp4 anchor is plotted in the box plot (grey). The fraction of positive
targets unique to Rbp4 (green) or unique to the L6 line (white) are also shown.
Box plots show median and IQR. Whiskers show minimum and maximum
values. d, log NPVs for thalamic targets shared by Ntsr1 and Syt6 were
significantly correlated (Spearman r = 0.77, P < 0.0001). e, log NPVs for thalamic
targets shared by replicate experiments in the same Cre line less than 500 µm
apart were significantly correlated (Spearman r = 0.84, P < 0.0001). f, The
average log NPVs originating from L6 are plotted against L5 for all spatially
matched experiments (Spearman r = 0.65, P < 0.0001). g, The matrix shows the
relative difference for each source × target connection originating from L5
versus L6, (L5 - L6)/(L5 + L6).


Laminar termination patterns in cortex

Using automated image registration to CCFv3, we quantified projection
strengths by layer within each cortical target (registration precision in
Extended Data Fig. 7a-c, Supplementary Table 7; see Methods). Then,
to identify common laminar termination patterns across all sources
and lines, we performed unsupervised hierarchical clustering with the
complete data set (849 cortical and 81 thalamic experiments). Data had
to pass three filters, as follows. (1) Target connection strength
(log10-transformed NPV) was above -1.5. This threshold was chosen after
analysing NPV frequency distributions for a set of manually verified
true positive and true negative connections (Extended Data Fig. 7d).
(2) Percentage of infection volume in the primary source was more than
50%. (3) Self-to-self (within area) projections were removed. Following
these steps, if present, multiple experiments were averaged, resulting
in a total of 7,063 (660 thalamus, 6,403 cortex) unique source-line-target
connections (Supplementary Table 8). We identified nine clusters
(Fig. 5a). The median relative density values for each layer and the overall
frequencies of these clusters are shown in Fig. 5b-d. Representative
images from specific connections (a given source-line-target) assigned
to each cluster from a cortical and thalamic source are shown in Fig. 5e.
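
The three filters and the averaging step can be sketched as below. The thresholds come from the text; everything else is an assumption made for illustration: records are plain dictionaries, relative density is taken to be per-layer density divided by the sum over layers, five layers are used, and the field and function names are hypothetical.

```python
from collections import defaultdict

LOG_NPV_MIN = -1.5             # filter 1: log10 NPV must exceed this threshold
MIN_PRIMARY_FRACTION = 0.5     # filter 2: more than 50% of infection in primary source


def passes_filters(rec):
    return (rec["log_npv"] > LOG_NPV_MIN
            and rec["primary_fraction"] > MIN_PRIMARY_FRACTION
            and rec["source"] != rec["target"])   # filter 3: no self-to-self


def relative_density(layer_density):
    """Scale per-layer densities so that they sum to 1 (assumed definition)."""
    total = sum(layer_density)
    return [d / total for d in layer_density]


def average_by_connection(records):
    """Average layer profiles over experiments sharing source, line and target."""
    groups = defaultdict(list)
    for rec in records:
        if passes_filters(rec):
            key = (rec["source"], rec["line"], rec["target"])
            groups[key].append(relative_density(rec["layers"]))
    return {key: [sum(col) / len(col) for col in zip(*profiles)]
            for key, profiles in groups.items()}


# Toy records; "layers" lists densities for L1, L2/3, L4, L5, L6.
records = [
    {"source": "VISp", "line": "Rbp4", "target": "VISl", "log_npv": -0.8,
     "primary_fraction": 0.9, "layers": [2.0, 1.0, 0.0, 1.0, 4.0]},
    {"source": "VISp", "line": "Rbp4", "target": "VISl", "log_npv": -1.0,
     "primary_fraction": 0.7, "layers": [1.0, 1.0, 0.0, 2.0, 4.0]},
    {"source": "VISp", "line": "Rbp4", "target": "VISp", "log_npv": 0.2,
     "primary_fraction": 0.9, "layers": [1.0, 1.0, 1.0, 1.0, 1.0]},   # self-to-self
    {"source": "VISp", "line": "Cux2", "target": "VISam", "log_npv": -2.1,
     "primary_fraction": 0.9, "layers": [1.0, 2.0, 1.0, 1.0, 0.0]},   # too weak
    {"source": "MOp", "line": "Rbp4", "target": "MOs", "log_npv": -0.5,
     "primary_fraction": 0.4, "layers": [1.0, 1.0, 0.0, 1.0, 1.0]},   # off-target injection
]
profiles = average_by_connection(records)
```

Each averaged profile would then be one column of the matrix that is clustered into laminar patterns.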

A summary of cluster representation shows that each cortical Cre
line and TC projection class is associated with more than one type of
target layer pattern (Fig. 5f). The most common, and significantly
enriched, laminar patterns from each are schematized in Fig. 5g
(Fisher’s exact test, P < 0.0001). L2/3 and L4 (Nr5a1) neurons project
predominantly to middle layers (L2/3, L4, and L5), avoiding L1. Other
L4 neurons project to L1 and either L2/3 or L5, avoiding L4 and L6. In
L5, when both IT and PT classes are labelled, as in the Rbp4 line, projections
target L6 and either L1 or L2/3. L5 IT neurons predominantly
target superficial layers (L1 and L2/3). L5 PT neurons target either deep
layers only (L5 and L6) or deep layers and L1, consistent with the L5
Rbp4 patterns representing both IT and PT patterns. L6 CT neurons
project predominantly to deep layers. From thalamic sources, core
neurons project to L4 and either L5 or L6, intralaminar and matrix-focal
neurons preferentially project to L5 and L6, and connections
coming from matrix-multiareal sources all project to L1, with differing
proportions in other layers.

Fig. 5 | Corticocortical and thalamocortical target lamination patterns.
a, Unsupervised hierarchical clustering on relative projection density per layer.
Each column is a unique combination of mouse line, cortical or thalamic source
area, and cortical target. Connections to agranular (no L4) regions are coloured
grey for L4. The dotted line indicates where the dendrogram was cut into nine
clusters. b, Median relative density by layer for each cluster. c, Number of
cortical or thalamic connections in each cluster, plotted on the left and right
y-axis, respectively. d, The frequency of cortical and thalamic targets assigned to
each cluster. The dotted line indicates the overall frequency of CC targets in the
entire data set (90.53%). e, Representative STPT images show axonal
lamination patterns from a connection assigned to each cluster from cortex or
thalamus. In columns 4, 8, and 9, thalamic axons passing through the
superficial corpus callosum are indicated (*cc). f, The relative frequency with
which each cortical Cre line and TC projection class appears in the clusters. The
fraction of experiments in a cluster belonging to each Cre line or class was
divided by the overall frequency of experiments from that Cre line or class. A
relative frequency value of 1 (white) indicates that the Cre line appeared in that
cluster with the same frequency as in the entire data set. Values below 1 (green)
indicate lower and above 1 (pink) indicate higher than expected frequency in a
cluster. Dots indicate significant positive enrichment in that cluster (Fisher’s
exact test, P < 0.0001). g, Schematic diagram shows significantly enriched axon
lamination patterns associated with each layer and/or class of origin in the
source area.
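
The relative frequency in Fig. 5f and the accompanying enrichment test can be sketched as follows. The relative frequency follows the definition in the legend. The test is written here as a one-sided Fisher's exact test (upper tail of the hypergeometric distribution), which is an assumption suggested by the phrase "positive enrichment"; the sidedness is not stated. Counts are toy values and the function names are hypothetical.

```python
from math import comb


def relative_frequency(n_line_in_cluster, n_cluster, n_line_total, n_total):
    """Fraction of a cluster made up by one line, divided by that line's overall fraction."""
    return (n_line_in_cluster / n_cluster) / (n_line_total / n_total)


def fisher_enrichment_p(n_line_in_cluster, n_cluster, n_line_total, n_total):
    """One-sided Fisher's exact test: P(X >= observed) under the hypergeometric null."""
    upper = min(n_cluster, n_line_total)
    favourable = 0
    for k in range(n_line_in_cluster, upper + 1):
        favourable += comb(n_line_total, k) * comb(n_total - n_line_total, n_cluster - k)
    return favourable / comb(n_total, n_cluster)


# Toy counts: 20 connections in total, 5 from the line of interest,
# and a cluster of 4 connections of which 3 come from that line.
rf = relative_frequency(3, 4, 5, 20)      # 0.75 / 0.25
p = fisher_enrichment_p(3, 4, 5, 20)      # (10 * 15 + 5 * 1) / 4845
```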


Hierarchy of cortical and thalamic areas 


We hypothesized that the above anatomical rules could be used across 
all cortical and thalamic regions to build a testable hierarchical model 
to predict the direction of information flow. We used cortical Cre line 
experiments (Fig. 6) because they allow incorporation of the specific 
layer termination patterns related to cell class, but results are also 
provided using wild-type data (Extended Data Fig. 10). 

[Fig. 6 graphic. Recoverable labels: a, CC mapping and TC mapping (FF, FB; layers L1 to L6); b, CT mapping; c, global hierarchy score, shuffled versus observed, for CC, CC + TC and CC + TC + CT; d, hierarchy score (all areas), axis from -1.0 to 1.0, areas ordered from bottom to top, with a module key (prefrontal, lateral, somatomotor, visual, medial, auditory, thalamus) and a series key (CC, CC + TC, CC + TC + CT).]

Fig. 6 | A hierarchical organization of areas and modules. a, Direction
mapping results for CC and TC terminal layer patterns. Median relative density
by layer for each cluster shown from Fig. 5b. b, Direction mapping for CT
connections. Scatterplot shows the log10-transformed NPVs for every CT
connection from L5 and L6. Points are colour-coded by the mapping (FF or FB)
predicted from the CC + TC hierarchy. Red line, linear discriminant analysis-assigned
connections (below, FF; above, FB). c, Global hierarchy scores
(Eq. (17)) for CC connections only (green), compared to the scores when TC and
CT connections are included sequentially (pink, blue). Scores for the original,
observed, data are shown as single outlined bars. Distributions of hierarchy
scores were obtained from shuffled data sets (n = 100). The medians of the
shuffled distributions estimate the lower bound (0.001, 0.044 and -0.002 for
CC, CC + TC and CC + TC + CT, respectively). d, Thirty-seven cortical areas and
twenty-four thalamic nuclei rank-ordered by their CC + TC + CT hierarchy
scores. Scores for each area using only CC or CC + TC connections are also
plotted. y-axis labels are colour-coded by module assignment for cortical
areas. e, Network diagram showing interconnections of all cortical visual areas
(light blue, visual module; dark blue, medial module). Edge width represents
relative connection density from Fig. 1e. The curved lines show outputs (left)
and inputs (right) to each node. Nodes are positioned along a single axis based
on hierarchical score. f, Intermodule network diagram. Edge width represents
sum of connection densities from Fig. 1e.

We used an unbiased approach to identify the optimal label
for each of the nine clusters: feedforward (FF) or feedback (FB). We
defined an initial hierarchical position for each cortical area (as both
a source and target) using the averaged difference between FB and FF
connections, normalized by a confidence measure for each cortical
Cre line (Eqs. (2-4) in Methods; see also results without this confidence
term in Extended Data Fig. 10). We searched over all possible mappings
between the nine layer patterns and directional assignments, and
determined which mapping resulted in the most self-consistent initial
hierarchy (maximized the CC global hierarchy score, which measures
how consistent the obtained hierarchy is with directions of individual
connections; Eq. (5) in Methods). For CC connections, clusters 2, 6,
and 9 were assigned to one direction, and 1, 3, 4, 5, 7 and 8 to the other
(Fig. 6a). For TC projections, clusters 2 and 6 were assigned to the same
direction and the rest to the other (Eqs. (8-10) in Methods). Cluster
9 switched directions for CC and TC. We confidently labelled these
two directions as either FF or FB on the basis of extensive anatomical
analyses relating our observed layer patterns to known hierarchical
order and rules between reciprocally connected regions» °° (Extended
Data Figs. 8, 9). After obtaining initial hierarchy positions with the
optimal mappings, scores were iterated to further refine the hierarchy
(Eqs. (6, 7, 11, 12) in Methods).
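
The exhaustive search over cluster-to-direction mappings can be illustrated with a toy example. With nine clusters and two directions there are 2^9 = 512 candidate mappings. The scoring used below is a simplified stand-in and not the definitions in Eqs. (2-5): it omits the Cre-line confidence term, defines an area's position as half the difference between the mean direction of its inputs and the mean direction of its outputs, and scores a mapping by the mean of direction × (target position - source position). The three-area network, the two clusters and all names are hypothetical.

```python
from itertools import product


def hierarchy_positions(connections, direction):
    """Toy position per area from the directions of its inputs and outputs."""
    incoming, outgoing = {}, {}
    for source, target, cluster in connections:
        d = direction[cluster]
        outgoing.setdefault(source, []).append(d)
        incoming.setdefault(target, []).append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    areas = set(incoming) | set(outgoing)
    return {a: 0.5 * (mean(incoming.get(a, [])) - mean(outgoing.get(a, [])))
            for a in areas}


def global_score(connections, direction):
    """Mean agreement between each connection's direction and the position gap."""
    pos = hierarchy_positions(connections, direction)
    terms = [direction[c] * (pos[t] - pos[s]) for s, t, c in connections]
    return sum(terms) / len(terms)


def best_mappings(connections, clusters):
    """Try every assignment of +1 or -1 to the clusters; return the top score and ties."""
    scored = []
    for signs in product((+1, -1), repeat=len(clusters)):
        direction = dict(zip(clusters, signs))
        scored.append((global_score(connections, direction), direction))
    top = max(score for score, _ in scored)
    return top, [d for score, d in scored if abs(score - top) < 1e-12]


# Toy network: areas A, B, C; (source, target, laminar cluster).
# Cluster 1 is used by ascending connections, cluster 2 by descending ones.
connections = [("A", "B", 1), ("B", "C", 1), ("A", "C", 1),
               ("B", "A", 2), ("C", "B", 2), ("C", "A", 2)]
top_score, best = best_mappings(connections, clusters=[1, 2])
```

In this toy score, flipping every label leaves the value unchanged, so the search returns a mapping together with its mirror image. It therefore separates the clusters into two opposing directions without naming them, which is in keeping with the separate anatomical step described above for deciding which direction is FF and which is FB.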

To label CT connections, we used the Cre-defined L5 and L6 projection
strengths (thresholded by log10-transformed NPV > -2.5; Extended
Data Fig. 7e; Eq. (13) in Methods). Linear discriminant analysis was
applied to assign CT connections into the FF or FB class that was most
self-consistent with the direction predicted from a CT hierarchy constructed
first from CC and TC projections (Fig. 6b, Supplementary
Table 9; Eq. (14) in Methods).

Using these rules, we obtained three versions of hierarchy based on 
(1) CC connections only, (2) CC and TC connections, and (3) CC, TC, and
CT connections. We demonstrate that there is significant hierarchical
organization by comparing global hierarchy scores with corresponding
distributions of scores from shuffled connections (Fig. 6c, see z-scores
in Extended Data Fig. 10f). Adding thalamus connections essentially
doubled the hierarchy scores (0.069, 0.120, and 0.128 for CC, CC + TC,
and CC + TC + CT, respectively). Nonetheless, by comparing the global
hierarchy scores with their maxima (0.679, 0.636, and 0.683; see Methods),
it appears to be a rather shallow hierarchy.
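
One way to express how shallow the hierarchy is places each observed score on a scale from the shuffled median (taken from the Fig. 6c legend) to the stated maximum. The linear rescaling below is an assumption about how such a comparison could be made, not a formula given in the text; for the full CC + TC + CT hierarchy it gives about 19%, which agrees with the figure quoted in the Discussion.

```python
# Global hierarchy scores reported in the text and in the Fig. 6c legend.
observed = {"CC": 0.069, "CC+TC": 0.120, "CC+TC+CT": 0.128}
shuffled_median = {"CC": 0.001, "CC+TC": 0.044, "CC+TC+CT": -0.002}
maximum = {"CC": 0.679, "CC+TC": 0.636, "CC+TC+CT": 0.683}


def fraction_of_range(key):
    """Position of the observed score between shuffled (0) and maximal (1)."""
    low, high = shuffled_median[key], maximum[key]
    return (observed[key] - low) / (high - low)


for key in observed:
    print(key, round(100 * fraction_of_range(key), 1), "% of the way to the maximum")
```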

The final hierarchical positions for 37 cortical areas and 24 thalamic 
nuclei are presented in Fig. 6d (scores in Supplementary Table 9). Most 
thalamic regions are located at the bottom or top, suggesting that they 
have pure driver or modulator effects on the cortical areas with which
they are connected. Several thalamic nuclei appear mid-hierarchy, indi- 
cating more balanced numbers of FF and FB connections. For cortical 
regions, primary visual cortex is at the bottom and the prefrontal area 
ORBvl (ventrolateral part of the orbital area) is at the top. Predicted
hierarchical positions were broadly similar across the three versions 
(CC, CC + TC, or CC + TC + CT). Most regions had only minor shifts in 
position. The largest shifts occurred in thalamic regions when adding 
CT connections. Hierarchies were also exceedingly robust to contribu- 
tions from any single Cre line, or layer or projection class (Extended 
Data Fig. 10h). 

We used similar methods to predict hierarchy for subsets of areas 
(visual cortex, Fig. 6e) and between modules (Fig. 6f). The intermodule 
hierarchy had a relatively low global score (0.07) compared to the all- 
area hierarchy (0.13), but it was more obviously organized into distinct 
levels; primary sensory modules at the bottom, lateral and medial 
modules in the middle, and prefrontal at the top. 


Discussion 


We used a genetic viral tracing approach, building on our previously 
established whole-brain imaging and informatics pipeline, to map projections
originating from unique cell populations in the same cortical
area, and from distinct projection classes in the thalamus. Our study 
represents a big step towards a true mesoscale connectome”. It will 
be informative for future connectome studies with more refined cell 
types and single cells*®”, which will no doubt reveal additional prin- 
ciples of cell-type-specific brain connectivity*’. With these mesoscale 
data, we derived several generalizable anatomical rules of cortical and 
thalamic connections, and tested whether the organizing principle of 
a hierarchy applies to mouse cortex and thalamus. 

The cortex is organized as a modular network?" which provides 
a structural view of possible paths of information flow, but does not 
impose direction or order onto that flow. By contrast, a hierarchy 
implies that interareal connections belong to at least two general types: 
feedforward or feedback. Specific anatomical projection patterns were 
previously associated with information transmission in these directions 
in primate and rodent visual cortex”?*"*., In our data, we observed 
many similar patterns. Two patterns that differed were the superficial 
layer projections (cluster 1) and the deep layer projections (cluster 9).
Felleman and van Essen” noted the occasional superficial-only pattern, 
but they called it feedback because it did not involve L4. Our results 
suggest this pattern is associated with feedforward. The strength and 
presence of projections between areas from the predominantly L4 
Cre lines was also unexpected, given canonical circuit diagrams“, and 
might be explained by varying degrees of layer selectivity. However, 
by reconstructing the complete dendritic and axonal morphology of 
single cells, we directly show that L4 neurons, even spiny stellate cells, 
can in fact have long-range projections. 


The hierarchy that we find is shallower than might have been
expected, even with inclusion of thalamic regions. The difference
between the lowest and the highest areas is less than two full levels,
and the all-area hierarchy global score is at 19% between random and
perfectly hierarchical. This might be characteristic of the mouse cortex,
given its high connection density, particularly when considering all
non-zero connection strengths*®. We did not explicitly include strengths
in computing hierarchy, except that weak connections were removed.
Notably, hierarchical position alone does not explain all of the connections
of a given area. This complexity may be why some have argued that
the concept of a hierarchy is overly simplistic for describing functional
properties**. Given the number of different connection types that arise
from a single area, future computational models that incorporate more
than feedforward and feedback labels will enable further insights into
the organization of brain networks.


Cortical hierarchies were previously derived from classic anterograde
or retrograde tracing without cell-class resolution. Using Cre
lines, we have mapped both layer of origin and target lamination pattern
in the same experiment. We found that L2/3 and L4 neurons have
predominantly feedforward layer projection patterns, whereas L5 and 
L6 neurons have both feedforward and feedback patterns. However, 
these general relationships depend on the specific source-target con- 
nection and Cre line. The Cre data set, with all this detail, produced the 
most robust hierarchy (Extended Data Fig. 10f). However, our results 
from wild-type mice provide a solid benchmark for others interested 
in applying these hierarchical model algorithms to classic tracing data. 
The calculation of global hierarchy scores for other datasets will enable 
direct comparisons between species and quantitative assessments of 
how development or disease might affect hierarchical organization. 


Methods


Mice 

Experiments involving mice were approved by the Institutional Animal 
Care and Use Committees of the Allen Institute for Brain Science in 
accordance with NIH guidelines. Sources of mouse lines are listed in 
Supplementary Table 1. Transgene expression patterns in many Cre 
driver lines used in this study have been previously characterized” 
and are available through the Transgenic Characterization data portal 
(http://connectivity.brain-map.org/transgenic). Cre lines were origi- 
nally derived on various backgrounds, but the majority were crossed 
to C57BL/6J mice for more than ten generations and maintained as het- 
erozygous lines upon arrival. Tracer injections were performed in male 
and female mice at an average age of P56 ± 10 days. Mice were group-
housed with a 12-h light-dark cycle. Food and water were provided ad
libitum. No statistical methods were used to predetermine sample 
size. The experiments were not randomized and investigators were not 
blinded to allocation during experiments and outcome assessment. 


Tracers and injection methods 

rAAV was used as an anterograde tracer. For most regions, stereotaxic 
coordinates were used to identify the appropriate location for a tracer
injection. Atlas-derived stereotaxic coordinates were chosen for each
target area based on The Mouse Brain in Stereotaxic Coordinates”.
Anterior-posterior (AP) coordinates are referenced from bregma,
medial-lateral (ML) coordinates are distance from midline at bregma, 
and dorsal-ventral (DV) depth is measured from the pial surface of 
the brain. Stereotaxic coordinates used for each experiment can be 
found through the data portal. For a subset of experiments in the left 
hemisphere, we first functionally mapped the visual cortex using intrin-
sic signal imaging (ISI) through the skull (described below) to assist in
targeting injections. A pan-neuronal AAV expressing EGFP (rAAV2/1.
hSynapsin.EGFP.WPRE.bGH, Penn Vector Core, AV-1-PV1696, Addgene
ID 105539) was used for injections into wild-type C57BL/6J mice (stock
no. 00064, The Jackson Laboratory). To label genetically defined 
populations of neurons, we used either a Cre-dependent AAV vector 
that robustly expresses EGFP within the cytoplasm of Cre-expressing 
infected neurons (AAV2/1.pCAG.FLEX.EGFP.WPRE.bGH, Penn Vector 
Core, AV-1-ALL854, Addgene ID 51502) or a Cre-dependent AAV
expressing a synaptophysin-EGFP fusion protein to more specifically 
label presynaptic terminals (AAV2/1.pCAG.FLEX.sypEGFP.WPRE.bGH, 
Penn Vector Core). 

Functional mapping of visual field space by ISI was used in some
cases to guide injection placement. Additional details of this procedure
can be found at http://help.brain-map.org/display/mouseconnectivity/Documentation?preview=/2818171/10813533/Connectivity_Overview.pdf.
In brief, a custom 3D-printed headframe was attached to the
skull, centred at 3.1 mm lateral and 1.3 mm anterior to lambda on the 
left hemisphere. A transcranial window was made by securing a 7-mm 
glass coverslip onto the skull in the centre of the headframe well. Mice 
were allowed to recover for at least seven days before ISI mapping. 
ISI was then used to measure the haemodynamic response to visual 
stimulation across the entire field of view of a lightly anaesthetized, 
head-fixed mouse. The visual stimulus consisted of sweeping a bar
containing a flickering black-and-white checkerboard pattern across 
a grey background“. To generate a map, the bar was swept across the 
screen ten times in each of the four cardinal directions, moving at 9° 
per second. Processing of sign maps followed methods previously 
described”, with minor modifications. Phase maps were generated 
by calculating the phase angle of the pre-processed discrete Fourier 
transform at the stimulus frequency. The phase maps were used to 
translate the location of a visual stimulus displayed on the retina to 
a spatial location on the cortex. A sign map was produced from the 
phase maps by taking the sign of the angle between the altitude and 
azimuth map gradients. Averaged sign maps were produced from a
minimum of three time series images, for a combined minimum aver-
age of 30 stimulus sweeps in each direction. Visual area segmentation
and identification was obtained by converting the visual field map 
into a binary image using a manually defined threshold and further 
processing the initial visual areas with a split/merge routine”. Sign 
maps were curated and the experiment repeated if (1) fewer than six 
visual areas were positively identified; (2) retinotopic metrics of VISp 
were out of bounds (azimuth coverage within 60-100° and altitude 
coverage within 35-60°); or (3) auto-segmented maps needed to be 
annotated with more than three adjustments. Each animal had three 
attempts to get a passing map. 
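
The sign-map step described above can be illustrated with a minimal sketch. It assumes the altitude and azimuth phase maps are already available as small 2D grids. The function names `gradient` and `sign_map`, and the sign convention (sine of the altitude-gradient angle minus the azimuth-gradient angle), are illustrative assumptions rather than the pipeline's actual implementation.

```python
import math

def gradient(grid):
    """Finite-difference gradient (d/dx, d/dy) of a 2D list; one-sided at the edges."""
    rows, cols = len(grid), len(grid[0])
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            gx[r][c] = (grid[r][c1] - grid[r][c0]) / (c1 - c0)
            gy[r][c] = (grid[r1][c] - grid[r0][c]) / (r1 - r0)
    return gx, gy

def sign_map(altitude, azimuth):
    """Sign (+1, -1 or 0) of the sine of the angle between the two map gradients."""
    alt_x, alt_y = gradient(altitude)
    azi_x, azi_y = gradient(azimuth)
    result = []
    for r in range(len(altitude)):
        row = []
        for c in range(len(altitude[0])):
            angle = (math.atan2(alt_y[r][c], alt_x[r][c])
                     - math.atan2(azi_y[r][c], azi_x[r][c]))
            s = math.sin(angle)
            row.append(0 if abs(s) < 1e-12 else (1 if s > 0 else -1))
        result.append(row)
    return result
```

Under this convention, a patch in which azimuth is mirrored relative to a neighbouring patch receives the opposite sign, which is what allows adjacent visual areas to be separated.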

ISI images were acquired using a pair of Nikon lenses (Nikkor 135
mm f/2.8 lens and 50 mm f/1.8), providing a magnification of 2.7×. Illu-
mination was from a ring of sequential and independent LED lights,
with green (peak wavelength of 527 nm and full width at half maximum
(FWHM) of 50 nm; Cree, C503B-GCN-CY0C0791) and red spectra
(peak wavelength of 635 nm and FWHM of 20 nm; Avago Technologies,
HLMP-EG08-Y2000), via a bandpass filter (630/92 nm, Semrock, FF01),
and acquired with a sCMOS camera (Andor, Zyla 5.5 10-tap). Illumina-
tion and image acquisition were controlled with in-house GUI software
written in Python. An image of the surface vasculature was acquired
with green LED illumination to provide fiducial marker references
on the surface of the brain.

All mice received one unilateral injection into a single target region. 
For injections using stereotaxic coordinates from bregma as a regis- 
tration point, procedures were followed as previously described’. For 
ISI-guided injections, the glass coverslip of the transcranial window 
was removed by drilling around the edges and a small burr hole was drilled,
first through the Metabond and then through the skull using surface 
vasculature fiducials obtained from the ISI session as a guide. An overlay 
of the sign map over the vasculature fiducials was used to identify the 
target injection site. rAAV was delivered by iontophoresis with current 
settings of 3 µA at 7 s ‘on’ and 7 s ‘off’ cycles for 5 min total, using glass
pipettes (inner tip diameters of 10-20 µm).

Some injections were done into lines with regulatable versions of 
Cre. Tamoxifen-inducible Cre line (CreER) mice were treated with 0.2 
mg/g body weight of tamoxifen solution in corn oil via oral gavage 
once per day for 5 consecutive days starting the week after virus injec- 
tion. Trimethoprim-inducible Cre line (dCre) mice were treated with 
0.3 mg/g body weight of trimethoprim solution in 10% DMSO via oral 
gavage once per day for 3 consecutive days starting the week after 
virus injection. For these Cre lines, brains were removed 4 weeks after 
the rAAV injection date as opposed to 3 weeks. All mice were deeply 
anaesthetized before intracardial perfusion, brain dissection, and tissue 
preparation for serial imaging as previously described’. 


Serial two-photon tomography and image data processing 
Imaging by STPT (TissueCyte 1000, TissueVision Inc. Somerville, MA) 
has been described’”°, and here we used the exact same procedures 
as in our earlier published studies’™. In brief, following tracer injec-
tions, brains were imaged using STPT at high x-y resolution (0.35 µm
× 0.35 µm) every 100 µm along the rostrocaudal z-axis, after which the
images underwent quality control and manual annotation of injection 
sites, followed by signal detection and registration to the Allen Mouse 
Brain Common Coordinate Framework, version 3 (CCFv3) through our 
informatics data pipeline (IDP). 

The IDP manages the processing and organization of the image and 
quantified data for analysis and display in the web application, as pre- 
viously described’. The two key algorithms are signal detection and 
image registration. Previous methods were implemented, except that 
two variations of the segmentation algorithm were employed, depend-
ing on the virus used for that experiment; one was tuned for EGFP detec-
tion and one for SypEGFP detection. High-threshold edge information
was combined with spatial distance-conditioned low-threshold edge 
results to form candidate signal object sets. The candidate objects
were then filtered on the basis of morphological attributes such as
length and area using connected component labelling. For the SypEGFP 
data, filters were tuned to detect smaller objects (punctate terminal 
boutons versus long fibres). In addition, high-intensity pixels near 
the detected objects were included in the signal pixel set. Detected 
objects near hyper-intense artefacts that occurred in multiple chan- 
nels were removed. We developed an additional filtering step using a 
supervised decision tree classifier to filter out surface segmentation 
artefacts, based on morphological measurements, location context 
and the normalized intensities of all three channels. 

The output is a full resolution mask that classifies each
0.35 µm × 0.35 µm pixel as either signal or background. An isotropic
3D summary of each brain is constructed by dividing each image series
into 10 µm × 10 µm × 10 µm grid voxels. Total signal is computed for
each voxel by summing the number of signal-positive pixels in that 
voxel. Each image stack is registered in a multi-step process using both 
global affine and local deformable registration to the 3D Allen mouse 
brain reference atlas as previously described’. Segmentation and reg- 
istration results are combined to quantify signal for each voxel in the 
reference space and for each structure in the reference atlas ontology 
by combining voxels from the same structure. 
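
As a rough illustration of the voxel summary, the sketch below bins a binary signal mask from one section into in-plane grid cells and then pools the counts by structure. The function names, the dictionary-based structure lookup, and the restriction to two dimensions are simplifying assumptions; the actual pipeline works on registered 3D image stacks.

```python
from collections import Counter

def voxel_signal_counts(mask, pixel_um=0.35, voxel_um=10.0):
    """Count signal-positive pixels falling into each in-plane grid voxel."""
    counts = Counter()
    for row_idx, row in enumerate(mask):
        for col_idx, is_signal in enumerate(row):
            if is_signal:
                voxel = (int(row_idx * pixel_um // voxel_um),
                         int(col_idx * pixel_um // voxel_um))
                counts[voxel] += 1
    return counts

def structure_signal(counts, voxel_to_structure):
    """Pool voxel counts by structure (hypothetical voxel -> structure lookup)."""
    totals = Counter()
    for voxel, n_pixels in counts.items():
        totals[voxel_to_structure[voxel]] += n_pixels
    return totals
```

With the default arguments, roughly 28 to 29 pixels along each axis fall into one 10 µm grid cell.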

Once an image series had passed quality control steps, injection 
site polygons overlaying the cell bodies of infected neurons were 
manually drawn. These polygons were informatically warped into 
the CCFv3 atlas space. Green channel signal intensity within the poly- 
gons was used to identify which structures have been injected, and 
to quantify the relative magnitude of their infections. The structure 
that received the largest proportion of signal intensity was identified 
as the primary injection site structure, and all other structures were 
considered secondary structures containing infected cells. A quanti- 
fied injection summary is provided for each image series through the 
data portal that shows the relative amount of signal detected within 
each infected structure. 


Quantification of projection strengths using segmentation and 
registration 

Projection signals can be quantified in several ways using our informat-
ics pipeline (see SDK help: https://allensdk.readthedocs.io/en/latest/connectivity.html#structure-level-projection-data).
Here, we most fre-
quently report ‘normalized projection volume’, which is the volume of
detected projection signals in all voxels in a structure (in mm³), divided
by the total volume of detected signal in the manually annotated injec-
tion site. We also use the ‘normalized connection densities’ output from
the voxel-level interpolation model for modularity analyses in Fig. 1e.
Connection density is the sum of detected projection pixels divided by
the sum of all pixels in that voxel or structure. Normalized connection
density is this value divided by the injection site density. 
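
The three quantities defined above reduce to simple ratios. The sketch below restates them in code, purely as a reading aid; the function and argument names are hypothetical and are not taken from the informatics pipeline or the SDK.

```python
def normalized_projection_volume(target_signal_mm3, injection_signal_mm3):
    """Detected signal volume in a structure divided by the injection-site signal volume."""
    if injection_signal_mm3 <= 0:
        raise ValueError("injection-site signal volume must be positive")
    return target_signal_mm3 / injection_signal_mm3

def connection_density(signal_pixels, total_pixels):
    """Fraction of pixels in a voxel or structure that were classified as signal."""
    return signal_pixels / total_pixels

def normalized_connection_density(density, injection_density):
    """Connection density divided by the density at the injection site."""
    return density / injection_density
```

For example, 0.05 mm³ of detected signal in a target after an injection with 0.5 mm³ of detected signal gives a normalized projection volume of 0.1.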

It is important to note that even after undergoing our quality con-
trol procedures, these informatically derived measures of connection
strength can include artefacts (false positives), and, particularly for the
EGFP tracer, report total signal from labelled axons, including passing 
fibres and synaptic terminals. For this reason, we performed extensive 
manual checking of all CC, CT, and TC targets to remove any signals 
from regions in which we could not identify any true positive axons 
or terminals (see main text). 


Morphological reconstruction of single L4 neurons 

The Cux2-IRES-CreERT2 driver line was crossed with a TIGRE2.0
reporter line’, Ai166, also known as TIGRE-MORF~. In brief, Ai166
expresses a Cre-dependent MORF transgene, composed of a
farnesylated EGFP preceded by a stretch of 22 guanine nucleotides
(22G-GFPf), which puts the transgene out of frame. Rare DNA replica-
tion errors lead to the deletion of one G, correcting the frameshift,
and leading to GFPf expression. Combining Ai166 with a CreERT2
line and giving mice a low dose of tamoxifen produces sparse cellular
labelling that is well suited for 3D morphological reconstruction™.
High-resolution whole-brain imaging by fMOST has been described 
previously™; similar protocols were used here to image the Cux2- 
IRES-CreERT2;Ai166 brain. Specifically, high-resolution block-face 
fluorescence imaging was done in coronal planes. Using a diamond 
knife, 1.0-µm sections were removed before we imaged subsequent
planes. The process was repeated through the entire rostral-caudal
extent of the mouse brain, producing more than 10,000 images with
a resolution of 0.3 × 0.3 × 1 µm (x × y × z). Following acquisition of
the complete fMOST image stack, it was converted into a multi-level 
navigable data set using the Vaa3D-TeraFly program®, and then recon- 
structions were performed using Vaa3D-TeraVR software tools built 
to facilitate semi-automated and manual reconstructions®. 


Creation of the cortical top-down and flattened views of the 
CCFv3 for data visualization 

A standard z-projection of signal in a top-down view of the cortex mixes
signal from multiple areas. Visualizations of fluorescence in Figs. 1-3 
instead project signal along a curved cortical coordinate system that 
more closely matches the columnar structure of the cortex. This coor- 
dinate system was created by first solving Laplace’s equation between 
pia and white matter surfaces, resulting in intermediate equi-potential 
surfaces. Streamlines were computed by finding orthogonal (steepest 
descent) paths through the equi-potential field. Cortical signal can then
be projected along these streamlines for visualization. 
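
The idea of solving Laplace's equation between two surfaces and then tracing steepest-descent streamlines can be sketched on a toy 2D grid. In the sketch below, the top row stands in for the pia (potential 0) and the bottom row for the white matter (potential 1). The Jacobi relaxation, the grid-neighbour descent, and all names are illustrative assumptions and do not reproduce how the CCFv3 coordinate system was actually computed.

```python
def solve_laplace(rows, cols, iterations=2000):
    """Jacobi relaxation with potential 0 on the top row and 1 on the bottom row."""
    phi = [[r / (rows - 1) if r in (0, rows - 1) else 0.5 for _ in range(cols)]
           for r in range(rows)]
    for _ in range(iterations):
        new = [row[:] for row in phi]
        for r in range(1, rows - 1):
            for c in range(cols):
                # Side edges reuse their own value, approximating zero-flux walls.
                left = phi[r][max(c - 1, 0)]
                right = phi[r][min(c + 1, cols - 1)]
                new[r][c] = 0.25 * (phi[r - 1][c] + phi[r + 1][c] + left + right)
        phi = new
    return phi

def streamline(phi, start):
    """Follow the lowest-potential neighbour from `start` until the pia row is reached."""
    path = [start]
    r, c = start
    while phi[r][c] > 0.0:
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                      if 0 <= r + dr < len(phi) and 0 <= c + dc < len(phi[0])]
        nxt = min(neighbours, key=lambda rc: phi[rc[0]][rc[1]])
        if phi[nxt[0]][nxt[1]] >= phi[r][c]:
            break  # no lower neighbour: stop rather than loop forever
        r, c = nxt
        path.append(nxt)
    return path
```

On a flat slab the equi-potential surfaces are evenly spaced rows and the streamlines are straight columns; on a curved cortex the same construction bends the streamlines to follow the local surface normal.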

A cortical flatmap was also constructed to enable visualization of 
anatomical and projection information while preserving spatial con- 
text for the entire cortex. The flatmap was created by computing the 
geodesic distance (the shortest path between two points on a curved
surface) between every point on the cortical surface and two pairs 
of selected anchor points. Each pair of anchor points forms one axis 
of the 2D embedding of the cortex into a flatmap. The 2D coordinate 
for each point on the cortical surface is obtained by finding the loca- 
tion such that the radial (circular) distance from the anchor points 
(in 2D) equals the geodesic distance that was computed in 3D. This 
procedure produces smooth mapping of the cortical surface onto a 
2D plane for visualization. This embedding does not preserve area,
and the frontal pole and medial-posterior regions are highly distorted.
As such, all numerical computation is done in 3D space. Similar tech- 
niques are used for texture mapping on geometric models in the field 
of computer graphics”. 


Network modularity analysis 

The matrix of connection weights between cortical areas (Fig. 1e) was
obtained from a model of voxel-level connectivity”. We analysed the
network structure of this graph using the Louvain Community Detec-
tion algorithm from the Brain Connectivity Toolbox
(https://sites.google.com/site/bctnet/)*°**. We determined the modularity metric
(Q) at various levels of granularity by varying the resolution parameter,
γ, from 0 to 2.5 in steps of 0.1. Q quantifies the fraction of connections
inside modules minus the fraction of connections expected inside the
same modules if the network was connected randomly; that is, a network
with Q = 0 has no more intramodule connections than expected by chance,
while Q > 0 indicates a network with some community structure.

For each value of γ, the modularity was computed 1,000 times and each
pair of regions received an affinity score between 0 and 1. The affin-
ity score is the probability of two regions being assigned to the same
module weighted by the modularity score (Q) for that iteration, thereby
assigning higher weights to partitions with a higher modularity score.
Each region was assigned to the module with which it had the highest
affinity, with the caveat that all structures within a module had an affin-
ity score >0.5 with all other members of the module. For each value of
γ, we also generated a shuffled matrix containing the same weights but
with the source and target regions randomized. The modularity for the
cortical matrix (Q) and the shuffled matrix (Q_shuffled) were evaluated at
each value of γ. As stated in the results, we chose to focus on the mod-
ules identified at γ = 1.3 (Q = 0.36) where the difference between Q and
Q_shuffled was at its peak (0.22 ± 0.017, mean ± s.d.), although it should be
noted that it was relatively stable between γ = 1 and γ = 1.8 (0.21 ± 0.019
at γ = 1, 0.20 ± 0.012 at γ = 1.8). Modules were identical from γ = 1.3 to
γ = 1.5 and showed only minor differences for γ between 1 and 2.
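
To make the role of the resolution parameter concrete, the sketch below evaluates a standard directed, weighted modularity Q for a given partition, and builds a shuffled control matrix. It is not the Brain Connectivity Toolbox implementation and does not perform the Louvain optimization itself; the function names and the choice to shuffle only off-diagonal entries are assumptions made for illustration.

```python
import random

def modularity(weights, labels, gamma=1.0):
    """Directed, weighted modularity Q of a partition at resolution gamma."""
    n = len(weights)
    total = sum(sum(row) for row in weights)
    out_strength = [sum(weights[i]) for i in range(n)]
    in_strength = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                expected = gamma * out_strength[i] * in_strength[j] / total
                q += weights[i][j] - expected
    return q / total

def shuffled_copy(weights, rng):
    """Same off-diagonal weights, randomly reassigned to source-target pairs."""
    n = len(weights)
    cells = [(i, j) for i in range(n) for j in range(n) if i != j]
    values = [weights[i][j] for i, j in cells]
    rng.shuffle(values)
    out = [[0.0] * n for _ in range(n)]
    for (i, j), value in zip(cells, values):
        out[i][j] = value
    return out
```

Raising gamma increases the penalty for the connections expected by chance, so partitions into smaller modules score relatively better; comparing Q against the value obtained from `shuffled_copy` mirrors the Q minus Q_shuffled comparison described in the text.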


Statistics and reproducibility 

We used the software program GraphPad Prism for statistical tests and 
generation of graphs, and the software program Gephi for visualiza- 
tion and layout of network diagrams”. The exact numbers of tracer 
injection experiments per mouse line and source area are shown in 
Supplementary Table 1, and range from n = 1 to 31. Not all experiments
were independently repeated because we sought to balance the need
for broad coverage across Cre lines and source areas with excessive
animal use. Previously, we demonstrated that n = 1 is a good predictor
of connectivity strengths across multiple animals¹. In this study, we also
show that the correlations between brain-wide projection strengths
from experiments at matched locations within the same mouse line
are consistent, positive, and significant (Spearman r > 0.8, P < 0.0001,
Extended Data Fig. 1). Sample sizes for analyses presented in all figures
are mostly noted in the main text, and can also be found in associated
Supplementary Tables. Specifics include: for Fig. 2g, h: n = number of
mice per line, for wild-type, Cux2, Rbp4: n = 27, Syt6: n = 23, A93: n = 22,
Tlx3, Ntsr1: n = 21, Scnn1a-Tg3: n = 19, Chrna2, Efr3a: n = 15, Nr5a1: n = 10,
Sim1: n = 9, Rorb: n = 6, Sepw1: n = 5; for Fig. 4c: n = number of mice per
line, for wild-type, Rbp4: n = 27, Syt6: n = 23, and Ntsr1: n = 21; for Fig. 4d:
n = 1,158 total CT connections, 462 are shared above threshold; for
Fig. 4e: n = 1,892 total CT connections, 628 are shared above threshold;
for Fig. 4f: n = 1,158 total CT connections, 495 are shared above thresh-
old; for Fig. 5a: n = 7,063 unique connections (columns). Numbers of
replicate experiments per each of the 7,063 connections ranged from 1
to 53, and are listed in Supplementary Table 8; for Fig. 5f: the number of
connections assigned to each cluster is plotted in Fig. 5c, and can also be
found in Supplementary Table 8 (cluster 1: n = 1,740, cluster 2: n = 366,
cluster 3: n = 375, cluster 4: n = 228, cluster 5: n = 602, cluster 6: n = 2,224,
cluster 7: n = 102, cluster 8: n = 129, cluster 9: n = 1,297). The number of
connections per cortical Cre line can also be found in Supplementary
Table 8 (for A93: n = 375, C57Bl6/J/Emx1: n = 1,431, Chrna2: n = 136, Cux2:
n = 703, Efr3a: n = 223, Nr5a1: n = 251, Ntsr1: n = 246, Rbp4: n = 1,149, Rorb:
n = 185, Scnn1a-Tg3: n = 263, Sepw1: n = 140, Sim1: n = 108, Syt6: n = 150,
Tlx3: n = 1,043), and per thalamic projection class was, for core: n = 62,
matrix-focal: n = 136, intralaminar: n = 160, matrix-multiareal: n = 302.
Figure 6b: n = 385 total CT connections.


Clustering analyses 

Unsupervised hierarchical clustering was conducted with the online 
software, Morpheus (https://software.broadinstitute.org/morpheus/). 
Log-transforms were calculated on all values after adding a small value
(0.5 minimum of the true positive array elements) to avoid log(0).
Proximity between clusters was computed using average linkages with 
Spearman rank correlations as the distance metric. Relative layer den- 
sity is the fraction of the total projection signal in each layer, scaled by 
the relative layer volumes in that target. The clustering algorithm works 
agglomeratively: initially assigning each sample to its own cluster and 
iteratively merging the most proximal pair of clusters until finally all 
the clusters have been merged. To compare distances between granular 
and agranular samples (those that lack L4), we used the median of the 
other present layers for L4. 
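
The preprocessing and distance metric used for clustering can be sketched as follows. Several details here are assumptions: the offset is read as 0.5 times the smallest positive value, the logarithm is taken as base 10, and the distance is taken as one minus the Spearman correlation. The function names are hypothetical, and the sketch does not reproduce Morpheus itself.

```python
import math

def log_transform(values, offset_fraction=0.5):
    """log10 of each value after adding offset_fraction x the smallest positive value."""
    offset = offset_fraction * min(v for v in values if v > 0)
    return [math.log10(v + offset) for v in values]

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks

def spearman_distance(x, y):
    """1 - Spearman rank correlation (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return 1.0 - cov / (sx * sy)
```

Average-linkage agglomeration would then repeatedly merge the two clusters with the smallest mean pairwise `spearman_distance` until a single cluster remains.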


Unsupervised discovery of hierarchy position 

Following the classification of the laminar patterns into nine clusters 
of CC and TC connections, we used an unsupervised method to 
simultaneously assign a direction to a cluster type and to construct a
hierarchy. 


We first defined hierarchy scores of cortical regions based on layer-
termination patterns of CC connections. First consider a mapping
function $M_{CC}$ for CC connections:

$$M_{CC}: \{1, \ldots, 9\} \rightarrow \{-1, 1\} \tag{1}$$

which maps a type of connection cluster ($C_{T,i,j} \in \{1, \ldots, 9\}$, where $C_{T,i,j}$
denotes the layer termination pattern of the connection from area $j$ to
area $i$ for Cre line $T$) to either feedforward ($M_{CC} = 1$) or feedback ($M_{CC} = -1$)
type. We search over the space of possible maps to see which map pro- 
duces the most self-consistent hierarchy. As some transgenic lines have 
different numbers of connections in different clusters, some maps will 
lead to particular transgenic lines having very biased feedforward or 
feedback calls. Thus, we add a confidence measure (conf(T)) for each
Cre line T, which decreases the importance of the information provided
by a transgenic line to the CC global hierarchy if the calls from that
transgenic line are biased. This allows us to reduce the bias in the regions
where experiments used more Cre lines that predominantly mark feed- 
forward or feedback connections. The Cre-dependent confidence 
measure is defined as: 


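$$\mathrm{conf}(T) = 1 - \left| \langle M_{CC}(C_{T,i,j}) \rangle_{i,j} \right| \tag{2}$$

with a global confidence as an average over all the inter-areal connec-
tions above the threshold ($10^{-1.5}$):

$$\mathrm{conf}_g = \langle \mathrm{conf}(T) \rangle_{i,j} \tag{3}$$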


We define the initial hierarchical position of an area as:

$$H_i^0 = \frac{1}{2}\left( \langle M_{CC}(C_{T,i,j}) \times \mathrm{conf}(T) \rangle_j - \langle M_{CC}(C_{T,j,i}) \times \mathrm{conf}(T) \rangle_j \right) \tag{4}$$

The first term, $\langle M_{CC}(C_{T,i,j}) \times \mathrm{conf}(T) \rangle_j$, describes the average direction
of connections to area $i$, and thus represents the hierarchical position of
the area as a target. The second term, $-\langle M_{CC}(C_{T,j,i}) \times \mathrm{conf}(T) \rangle_j$, on the
other hand, represents the average direction of connections from area
$i$, depicting the hierarchical position of the area as a source. The hier-
archical position of a cortical area is the average between its hierarchi-
cal position as source and target.
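
The confidence measure and the initial hierarchy score can be written out as a short sketch. Each connection is represented as a hypothetical tuple `(cre_line, source, target, cluster)`, and `mapping` assigns +1 (feedforward) or -1 (feedback) to each cluster. Treating an area with no incoming or no outgoing connections as contributing 0 for that term is an assumption of this sketch, not something stated in the text.

```python
from collections import defaultdict

def mean(values):
    """Arithmetic mean; an empty list is treated as 0 (assumption)."""
    return sum(values) / len(values) if values else 0.0

def confidence_per_line(connections, mapping):
    """conf(T) = 1 - |mean direction of all connections labelled in Cre line T|."""
    by_line = defaultdict(list)
    for line, _source, _target, cluster in connections:
        by_line[line].append(mapping[cluster])
    return {line: 1.0 - abs(mean(dirs)) for line, dirs in by_line.items()}

def initial_hierarchy(connections, mapping):
    """H_i^0 = 0.5 * (mean weighted direction into i - mean weighted direction out of i)."""
    conf = confidence_per_line(connections, mapping)
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for line, source, target, cluster in connections:
        weighted = mapping[cluster] * conf[line]
        incoming[target].append(weighted)
        outgoing[source].append(weighted)
    areas = set(incoming) | set(outgoing)
    return {a: 0.5 * (mean(incoming[a]) - mean(outgoing[a])) for a in areas}
```

A Cre line whose connections are all called feedforward (or all feedback) gets a confidence of 0 and therefore contributes nothing to the scores, which is the bias correction described above.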

To test how self-consistent a hierarchy is we define the CC global 
hierarchy score: 


$$h_{CC} = \frac{1}{\mathrm{conf}_g} \langle M_{CC}(C_{T,i,j}) \times \mathrm{conf}(T) \times (H_i^0 - H_j^0) \rangle_{i,j} \tag{5}$$


We performed an exhaustive search over all the maps $M_{CC}$ for the entire
set of CC connections, and the most self-consistent hierarchy that
maximizes the CC global hierarchy score is obtained when connec-
tions of clusters 2, 6 and 9 are of one direction and 1, 3, 4, 5, 7 and 8 are
of the opposite direction. As described in the main text, we conclude 
that clusters 2,6 and 9 are feedback connection patterns, and the other 
group of clusters corresponds to feedforward. 
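
The exhaustive search over cluster-to-direction maps can be sketched as below. With nine clusters there are only 2^9 = 512 candidate maps, so brute force is feasible. The tuple layout `(cre_line, source, target, cluster)` and all function names are hypothetical, and returning `None` for maps whose global confidence is zero is an assumption made so that the sketch does not divide by zero.

```python
from collections import defaultdict
from itertools import product

def mean(values):
    return sum(values) / len(values) if values else 0.0

def global_score(connections, mapping):
    """Global hierarchy score for one candidate map, or None if it is undefined."""
    by_line = defaultdict(list)
    for line, _s, _t, cluster in connections:
        by_line[line].append(mapping[cluster])
    conf = {line: 1.0 - abs(mean(d)) for line, d in by_line.items()}
    conf_g = mean([conf[line] for line, _s, _t, _c in connections])
    if conf_g == 0.0:
        return None  # every line is fully biased under this map
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for line, s, t, cluster in connections:
        weighted = mapping[cluster] * conf[line]
        incoming[t].append(weighted)
        outgoing[s].append(weighted)
    h0 = {a: 0.5 * (mean(incoming[a]) - mean(outgoing[a]))
          for a in set(incoming) | set(outgoing)}
    terms = [mapping[c] * conf[line] * (h0[t] - h0[s])
             for line, s, t, c in connections]
    return mean(terms) / conf_g

def best_mapping(connections, clusters):
    """Try every assignment of +1/-1 to the clusters and keep the highest score."""
    best, best_score = None, float("-inf")
    for signs in product((-1, 1), repeat=len(clusters)):
        mapping = dict(zip(clusters, signs))
        score = global_score(connections, mapping)
        if score is not None and score > best_score:
            best, best_score = mapping, score
    return best, best_score
```

Flipping every sign in a map also flips every hierarchy score, so the global score is unchanged. The search therefore only separates the clusters into two opposing groups; deciding which group is feedforward requires outside knowledge, as noted in the text.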

The initial hierarchy score ($H_i^0$) of each area $i$ is thus obtained by
computing the average direction of connections to and from the area
(Eq. (4)) while concurrently searching for the optimal mapping of each
lamination pattern to either the feedforward or feedback direction,
and is bounded by −1 and 1. After we obtain the initial positions in the
hierarchy, the hierarchy scores of all cortical regions are iterated until 
the fixed points are reached, to refine the cortical hierarchy. Without 
iterations, the hierarchy scores account only for the number of feed- 
forward and feedback connections each area receives or sends out. 
Therefore, the initial hierarchy obtained by Eq. (4) alone does not 
account for the hierarchy positions of the target and source areas that 
each cortical area makes connections to, and places any two areas with
the same number of feedforward and feedback connections at the
same level in the hierarchy. To address this issue, we implement a two-
step iterative scheme:

$$H_i^{n+1/2} = \frac{1}{2}\left( \langle H_j^n + M_{CC}(C_{T,i,j}) \rangle_j + \langle H_j^n - M_{CC}(C_{T,j,i}) \rangle_j \right) \tag{6}$$

$$H_i^{n+1} = H_i^{n+1/2} - \langle H_j^{n+1/2} \rangle_j \tag{7}$$


where $n$ refers to iterative steps. The first part (Eq. (6)) refines the hierar-
chy score of area $i$ on the basis of the current hierarchy scores of its target
and source areas. The next part (Eq. (7)) subtracts the hierarchy scores 
averaged over all areas to remove possible drifts. At every iteration 
step we also check to see whether the mapping of connection clusters 
to feedforward or feedback connection needs to change; however, it 
remained constant through the iterations. We found that the hierarchy 
scores reach the fixed points after just a few (<5) iterations, and used 20 
iterations to find the final hierarchy scores of all areas. These final hier- 
archy scores are denoted as the hierarchy obtained by CC connections. 
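
The two-step iteration can be sketched as follows, under one reading of the update rule: as a target, an area is placed one step above (feedforward) or below (feedback) each of its sources; as a source, it is placed one step below or above each of its targets; the two averages are then averaged, and the mean over all areas is subtracted to remove drift. The edge format `(source, target, direction)` and the function names are hypothetical, and the treatment of areas lacking incoming or outgoing connections (that term counts as 0) is a simplification.

```python
from collections import defaultdict

def mean(values):
    return sum(values) / len(values) if values else 0.0

def iterate_hierarchy(edges, scores, n_iterations=20):
    """Refine hierarchy scores; direction is +1 (feedforward) or -1 (feedback)."""
    scores = dict(scores)
    for _ in range(n_iterations):
        as_target, as_source = defaultdict(list), defaultdict(list)
        for source, target, direction in edges:
            as_target[target].append(scores[source] + direction)
            as_source[source].append(scores[target] - direction)
        # Half step: average of the position implied as a target and as a source.
        half_step = {a: 0.5 * (mean(as_target[a]) + mean(as_source[a]))
                     for a in scores}
        # Full step: subtract the mean over all areas to remove drift.
        drift = mean(list(half_step.values()))
        scores = {a: h - drift for a, h in half_step.items()}
    return scores
```

For a three-area chain with feedforward connections going up and feedback connections coming down, the scores settle after a single iteration, consistent with the fast convergence reported in the text.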

Next, we examined whether and how the TC connections affect the 
cortical hierarchy, by incorporating layer termination patterns of TC 
connections in addition to the CC connections. As in CC connections,
TC connections are clustered into nine types on the basis of their layer
termination patterns. The mapping of the lamination patterns is based
on the hierarchy scores of cortical regions obtained by CC connections,
while the hierarchical positions of thalamic areas relative to the cortical 
areas are found concurrently. The mapping of the TC layer termination 
types to directions is defined as: 


$$M_{TC}: \{1, \ldots, 9\} \rightarrow \{-1, 1\} \tag{8}$$


similar to the mapping of CC connections in Eq. (1). Because thalamic 
areas are always the source in TC connections, the initial hierarchy 
score of each thalamic area $i$ is defined by the average direction of con-
nections from the area:

$$H_i^0 = -\left\langle M_{TC}(C_{T,j,i}) \times \frac{\min(N_{ff}, N_{fb})}{N_{ff} + N_{fb}} \right\rangle_j \tag{9}$$

where the mapping of the lamination patterns, $M_{TC}$, is obtained by
searching for the most self-consistent assignment that maximizes
the TC global hierarchy score $h_{TC}$:

$$h_{TC} = \left\langle M_{TC}(C_{T,i,j}) \times (H_i^0 - H_j^0) \times \frac{\min(N_{ff}, N_{fb})}{N_{ff} + N_{fb}} \right\rangle_{i,j} \tag{10}$$

The parameters $N_{ff}$ and $N_{fb}$ refer to the numbers of feedforward and
feedback TC connections, respectively. The multiplier $\frac{\min(N_{ff}, N_{fb})}{N_{ff} + N_{fb}}$ biases
the optimization method to preferentially search for mappings that
result in roughly equal numbers of feedforward and feedback connec-
tions. Without such a weight favouring an equal division of the connections,
the search algorithm always assigns TC connections to be feedforward,
placing all thalamic areas below cortical areas.
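
The effect of the balancing multiplier is easy to see numerically. The sketch below computes min(N_ff, N_fb)/(N_ff + N_fb) and applies it to an initial thalamic score in the spirit of Eq. (9); the function names and the list-of-directions input are illustrative assumptions.

```python
def balance_weight(n_ff, n_fb):
    """min(N_ff, N_fb) / (N_ff + N_fb): 0.5 for an even split, 0 if one-sided."""
    total = n_ff + n_fb
    return min(n_ff, n_fb) / total if total else 0.0

def initial_thalamic_score(outgoing_directions, n_ff, n_fb):
    """Minus the mean weighted direction (+1/-1) of connections leaving a thalamic area."""
    weight = balance_weight(n_ff, n_fb)
    return -sum(d * weight for d in outgoing_directions) / len(outgoing_directions)
```

A map that calls every TC connection feedforward has a weight of 0, so its global score is 0 and it cannot win the search, whereas an even split receives the maximum weight of 0.5.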


As with CC connections, we performed an exhaustive search over 
all the maps $M_{TC}$ for the entire set of TC connections to find the most
self-consistent hierarchy that maximizes the TC global hierarchy score. 
For TC connections, we found that connections of cluster 2 and 6 are 
of one direction and the rest of the clusters are of the opposite direc- 
tion. Again, as described in the main text, we conclude that clusters 2 
and 6 are feedback, and the rest correspond to feedforward patterns. 

Once the initial positions of the thalamic areas in the hierarchy are
obtained, hierarchy scores of thalamic and cortical areas are iterated
until the fixed points are reached (20 iterations), using a full mapping
function $M_{CC+TC}$ that combines $M_{CC}$ and $M_{TC}$ for CC and TC connections,
respectively:

$$H_i^{n+1/2} = \frac{1}{2}\left( \langle H_j^n + M_{CC+TC}(C_{T,i,j}) \rangle_j + \langle H_j^n - M_{CC+TC}(C_{T,j,i}) \rangle_j \right) \tag{11}$$

$$H_i^{n+1} = H_i^{n+1/2} - \langle H_j^{n+1/2} \rangle_j \tag{12}$$


Finally, the effect of CT connections on the hierarchy is considered. 
Either feedforward or feedback direction is assigned to CT connections 
depending on the cortical layer from which the connections originate. 
Specifically, we classified CT connections based on the $\log_{10}$-trans-
formed NPV from layers 5 and 6 of the source areas. Therefore, the
mapping of CT connections is described by:

$$M_{CT}: [\mathrm{L5}\ \log_{10}\mathrm{NPV},\ \mathrm{L6}\ \log_{10}\mathrm{NPV}] \rightarrow \{-1, 1\} \tag{13}$$

We first determined the predicted direction (feedforward or feedback) 
of each CT connection based on the hierarchy constructed from CC 
and TC projection patterns. These directions of CT connections show 
mixed L5 and L6 source expressions. To classify the CT connections
to either L5 or L6 dominance and, subsequently, to feedforward or
feedback, we used linear discriminant analysis on $\log_{10}$-transformed
NPV values of L5 and L6 lines with a prior that biases the method to
yield about equal numbers of L5- and L6-dominant connections. The
classifier assigns feedforward direction to connections with stronger
L5 source NPV, and feedback direction to L6-dominant connections,
using the linear separator. Once directions of CT connections have been
obtained, the mappings $M_{CC}$, $M_{TC}$ and $M_{CT}$ are combined to construct
a comprehensive mapping $M_{CC+TC+CT}$ of all connections among corti-
cal and thalamic areas to directions. The initial positions of thalamic
regions in the hierarchy are computed by:

$$H_i^0 = \frac{1}{2}\left( \langle M_{TC+CT}(C_{T,i,j}) \rangle_j - \langle M_{TC+CT}(C_{T,j,i}) \rangle_j \right) \tag{14}$$

where $M_{TC+CT}$ is the mapping of all TC and CT connections. Note that the
multiplier $\frac{\min(N_{ff}, N_{fb})}{N_{ff} + N_{fb}}$ used for initial thalamic hierarchy with TC connec-
tions only (which biases thalamus to be towards the centre of the hier-
archy) is not needed here, owing to the presence of the CT connections
in the computations. However, the bias is not fully eliminated as it influ-
enced the initial assignment of CT and TC connection types to be feed-
forward or feedback. The initial hierarchy scores are iterated together
with hierarchy scores of cortical areas obtained from Eqs. (6, 7):

$$H_i^{n+1/2} = \frac{1}{2}\left( \langle H_j^n + M_{CC+TC+CT}(C_{T,i,j}) \rangle_j + \langle H_j^n - M_{CC+TC+CT}(C_{T,j,i}) \rangle_j \right) \tag{15}$$

$$H_i^{n+1} = H_i^{n+1/2} - \langle H_j^{n+1/2} \rangle_j \tag{16}$$


In this way, we obtained three different versions of cortical hierarchy 
constructed from: (1) CC connections only, (2) CC connections and 
thalamocortical connections, and (3) CC, TC, and CT connections. 

We examined how the additional information provided by TC and CT 
connections affects the self-consistency of the hierarchy by comparing 
the global hierarchy scores of the three different versions of hierarchy. 
For this purpose, we compared the global hierarchy scores without any 
confidence or weight multiplier: 


$$h = \langle M(C_{T,i,j}) \times (H_i - H_j) \rangle_{i,j} \tag{17}$$


In addition to the hierarchy of all areas, we also constructed the 
intermodule hierarchy of cortical areas. We used the same mappings 
obtained from construction of the all-area hierarchy to classify the 
lamination patterns. For intermodule hierarchy, all the connections 
to and from each module were used to build the hierarchy among the 
modules. 


Global hierarchy score of shuffled connectomes and ‘perfectly 
hierarchical’ connectome 
To evaluate ‘how hierarchical’ the mouse brain is, we generated shuffled 
connectivity data for the connection patterns, computed the global 
hierarchy scores, and compared the global hierarchy scores of the 
shuffled connectomes to that of the mouse brain connectome. The 
shuffled connectivity is constructed by randomly rearranging sources 
and targets, while preserving the projection layer patterns and the 
distributions of source and target areas, within each Cre line. We gener- 
ated 100 versions of shuffled connectivity data, and calculated their 
global hierarchy scores as was done with the original connectivity data, 
described in the previous section. The medians of the shuffled distri- 
butions provide an estimate of the lower bound of this score (0.001, 
0.044, -0.002, for CC, CC + TC, CC + TC + CT, respectively; Fig. 6c). 
We also generated connectivity data with perfectly self-consistent 
hierarchy, which provides the upper bound of the global hierarchy 
score. To do this, we assigned a direction (feedforward or feedback) 
for each connection in the mouse brain connectivity data, based on the
final hierarchy positions of the cortical and thalamic regions. With this
‘true’ mapping of each connection to a direction, the global hierarchy
score is computed using Eq. (17), producing values of 0.679, 0.636, and 
0.683, respectively, for CC, CC + TC, and CC + TC + CT connections. 
Therefore, comparison of global hierarchy scores allows us to evalu- 
ate how hierarchical the mouse brain is compared to the hierarchy by 
chance (shuffled) and the perfect hierarchy (upper bound). The global
hierarchy score with the shuffled mean subtracted and normalized by
the strictly hierarchical data provides a single measure that quantifies
the steepness of hierarchy across arbitrarily different connectivity data.
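
The shuffling control and the normalization described above can be sketched as follows. This is a minimal illustration, not the published analysis code. It assumes each connection is stored as a hypothetical tuple (cre_line, source, target, cluster), that sources and targets are permuted independently within each Cre line, and that "normalized by the strictly hierarchical data" means dividing by the gap between the perfect-hierarchy score and the shuffled mean. Dividing by the perfect-hierarchy score alone would be another reading of the text.

```python
import random
from collections import defaultdict
from statistics import mean


def shuffle_within_cre_line(connections, rng):
    """Permute sources and targets independently within each Cre line.

    `connections` is a list of (cre_line, source, target, cluster) tuples
    (a hypothetical layout). The cluster (laminar pattern) of every row and
    the per-line counts of each source and target area are preserved.
    """
    rows_by_line = defaultdict(list)
    for idx, (line, _source, _target, _cluster) in enumerate(connections):
        rows_by_line[line].append(idx)

    shuffled = list(connections)
    for line, idxs in rows_by_line.items():
        sources = [connections[i][1] for i in idxs]
        targets = [connections[i][2] for i in idxs]
        rng.shuffle(sources)
        rng.shuffle(targets)
        for i, source, target in zip(idxs, sources, targets):
            shuffled[i] = (line, source, target, connections[i][3])
    return shuffled


def normalized_score(observed, shuffled_scores, perfect):
    """Rescale a score so the shuffled mean maps to 0 and the perfect hierarchy to 1.

    Assumption: the denominator is (perfect - shuffled mean).
    """
    baseline = mean(shuffled_scores)
    return (observed - baseline) / (perfect - baseline)
```

In use, the shuffle would be repeated (100 times in the text), the global hierarchy score recomputed for each shuffled data set, and the resulting list passed as `shuffled_scores`. Note that this simple permutation can produce self-connections; whether those are excluded in the original analysis is not stated here.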


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


Data (including high-resolution images, segmentation, registration to 
CCFv3, and automated quantification of injection size, location, and dis- 
tribution across brain structures) are available through the Allen Mouse 
Brain Connectivity Atlas portal (http://connectivity.brain-map.org/). 
Individual experiment summaries can be viewed using this link:
http://connectivity.brain-map.org/projection/experiment/[insert experimental id].
Experimental ids are listed in Supplementary Table 2. In
addition to visualization and search tools available at this site, users can
download data using the Allen Brain Atlas API
(http://help.brain-map.org/display/mouseconnectivity/API) and the Allen Brain Atlas Software
Development Kit (SDK: http://alleninstitute.github.io/AllenSDK/connectivity.html).
Through the SDK, structure and voxel-level projection
data are available for download. Examples of code for common
data requests are provided as part of the Mouse Connectivity Jupyter
notebook to help users get started with their own analyses. Source
data generated for this study are provided as Supplementary Tables
as indicated throughout. Code and data files for hierarchical analyses
are available through the Allen SDK and GitHub
(https://github.com/AllenInstitute/MouseBrainHierarchy).
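
As a small convenience, the experiment summary link given above can be filled in programmatically. The helper below is a hypothetical sketch: it only substitutes an experiment id into the URL pattern quoted in the text and does not contact the portal. The id in the usage note is a placeholder, not an id taken from Supplementary Table 2.

```python
# URL pattern quoted in the Data availability statement.
EXPERIMENT_BASE_URL = "http://connectivity.brain-map.org/projection/experiment/"


def experiment_summary_url(experiment_id):
    """Return the summary-page URL for one experiment id (hypothetical helper)."""
    experiment_id = int(experiment_id)  # accepts ints or numeric strings
    if experiment_id <= 0:
        raise ValueError("experiment id must be a positive integer")
    return f"{EXPERIMENT_BASE_URL}{experiment_id}"
```

For example, `experiment_summary_url(12345)` builds the link for a placeholder id 12345; real ids should be taken from Supplementary Table 2.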

Extended Data Fig. 1 | Similarity of connection strengths by distance, virus,
hemisphere, and Emx1-IRES-Cre or C57BL/6J mice. a–d, Most experiments
were done with the Cre-dependent rAAV tracer, rAAV2/1.pCAG.FLEX.EGFP.WPRE.
A subset of left hemisphere injections had a duplicate injection of rAAV
with a synaptophysin-EGFP fusion transgene in place of the cytoplasmic EGFP
(rAAV2/1.pCAG.FLEX.SypEGFP.WPRE). This tracer allowed us to address
whether labelling presynaptic terminals would improve the accuracy with
which we could quantify target connection strength, particularly in brain
regions that contain mostly fibres of passage. Data consisted of n=275
experiments (137 EGFP, 138 SypEGFP). These were matched across Cre lines and
areas, and represent n=8 Cre lines and n=26 cortical areas. For pairs of
spatially matched experiments, the average projection strength (log10-transformed
NPV) measured across the entire brain was lower in SypEGFP than
in EGFP experiments (~0.8 log unit when <500 µm apart). However, brain-wide
projection values were still highly and significantly correlated. Thus, we
included the SypEGFP data sets when indicated for analyses of connectivity
patterns from given source areas (but only in comparison with other SypEGFP
datasets). a, Spearman correlation coefficients (r) of normalized projection
volumes for all possible pairs of injections (different and same tracer, all in the
same Cre line) plotted against the distance between the injection centroids.
Linear regressions showed a significant negative slope (P < 0.0001), with r
decreasing as distance between injections increased. b, r plotted for injections
within 500 µm of each other; slopes were not significantly different from zero
and means were not significantly different from each other. Average and s.d.
for each group are shown by the large symbols on the left (EGFP vs EGFP:
0.81 ± 0.056, SypEGFP vs SypEGFP: 0.79 ± 0.064, SypEGFP vs EGFP:
0.79 ± 0.071). c, Quantitative differences in projection strengths measured
between replicates with the same virus and between SypEGFP and EGFP
(logNPV(EGFP) − logNPV(SypEGFP)) injections, all <500 µm apart in the same
Cre line (n=133 within-virus and 222 between-virus comparisons). Boxplots
show median, IQR, minimum and maximum values; + indicates mean.
d, Maximum intensity projections from four experiments within 500 µm of
each other illustrate overall similarities between replicate injections and
tracers (r shown for each pair). Injections targeted primary visual cortex (VISp)
in Emx1-IRES-Cre mice using either EGFP or SypEGFP tracers as indicated.
e–g, Injections into Emx1-IRES-Cre mice were made into visual areas on the left
hemisphere, whereas all C57BL/6J mice received injections into the right
hemisphere. Following registration to the CCF, which is a symmetric atlas, we
identified three pairs of experiments in which the injection centroids were
<500 µm apart after we flipped injection site coordinates from the left to the
right. Cortical projections were visually similar across both lines and
hemispheres, and cortical connectivity strengths (to the 86 cortical targets)
from these individual experiments (normalized projection volumes) were
positively and strongly correlated as indicated. Thus, in Fig. 2 we merged the
Emx1 and C57BL/6J data to represent connection strengths from all layers and
classes, and in some of the ‘anchor’ groups we used data from both left and
right hemisphere injections.

Extended Data Fig. 2 | Characterization of cortical projection neuron classes
and layer selectivity across mouse lines. a, Brain-wide projection patterns
were visually inspected for every experiment and manually classified into one
of six categories on the basis of projections to ipsilateral and contralateral
cortex, striatum, thalamus, and midbrain, pons or medulla structures as
described for IT, PT, and CT classes. b–d, Unsupervised hierarchical clustering
(using Euclidean distance and average linkage) of projection weights validates
and reveals major classes of cortical projection neurons. b, Each column of the
heat map shows one of the 1,081 injection experiments. Colours in the ‘manual
PN’ class are coded as in c for projection class. Rows show selected major brain
regions that distinguish known classes of projection neurons. Values in each
cell are the fractions of total brain projection volume in the given region. The
dendrogram was split into nine clusters, with two subclusters identified post hoc
for cluster 5. The numbers of experiments per cluster were: 1, n=24; 2, n=4;
3, n=204; 4, n=158; 5a, n=148; 5b, n=230; 6, n=174; 7, n=12; 8, n=16; 9, n=111.
The numbers of experiments per projection class were: CT, n=119; IT, n=342; IT
PT, n=158; IT PT CT, n=189; local, n=100; PT, n=173. c, The relative frequency
of experiments from manually assigned projection classes within each cluster
is shown. There was significant enrichment of 1, or 2 related, classes in each
cluster (dots; Fisher’s exact test, P < 0.01). d, Maximum intensity projections
of GFP-labelled axons across the brain from one example per cluster.
e, Characterization of layer selectivity in wild-type mice and 14 Cre lines
derived from injection experiments. Number of experiments per line is listed in
Supplementary Table 1. For every injection and line, we assessed layer 
selectivity on the basis of the manually annotated injection sites. Polygons 
were drawn around every injection site so that, after registration to the CCF, 
injection volume in each layer could be informatically derived. A layer- 
selectivity index was calculated for each experiment (the fraction of the total 
injection volume contained in each layer, scaled by the relative volume of each 
layer in the injection source region, because layer volumes differ by area). Plots 
show individual data points and the average layer selectivity index ± 95%
confidence intervals (in black) for the set of 15 mouse lines. Red lines in each
Cre graph show average values from C57BL/6J experiments. Red lines in the
C57BL/6J graph are averages from the Emx1-IRES-Cre experiments, which also
label cells across all layers. There is a bias towards L5 neuron infection in both
C57BL/6J and Emx1-IRES-Cre mice, highlighting the importance of using layer-
selective Cre lines for better coverage of cortical outputs. 
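
The layer-selectivity index described in panel e can be illustrated with a short sketch. It assumes that "scaled by the relative volume of each layer" means dividing the injected fraction in a layer by that layer's share of the source region volume, so that an injection spread in proportion to layer volume gives an index of 1 in every layer. The function and argument names are hypothetical.

```python
def layer_selectivity_index(injection_volume_by_layer, layer_volume_by_layer):
    """Fraction of injection volume per layer, divided by the layer's share
    of the source region volume (assumed reading of 'scaled by')."""
    total_injection = sum(injection_volume_by_layer.values())
    total_layer = sum(layer_volume_by_layer.values())
    if total_injection <= 0 or total_layer <= 0:
        raise ValueError("volumes must sum to a positive number")

    index = {}
    for layer, layer_volume in layer_volume_by_layer.items():
        if layer_volume <= 0:
            raise ValueError(f"layer {layer!r} has no volume in this region")
        injected_fraction = injection_volume_by_layer.get(layer, 0.0) / total_injection
        layer_fraction = layer_volume / total_layer
        index[layer] = injected_fraction / layer_fraction
    return index
```

Under this reading, values above 1 indicate that a layer holds more of the injection than its volume alone would predict, and values below 1 indicate less.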

Extended Data Fig. 3 | Computationally removing the distance dependence
of connection weights alters the modular structure of the cortex. To test the
degree to which the spatial proximity of regions affects modularity analysis, we
used a power law to fit the distance component of our ipsilateral CC
connectivity matrix. Then, we repeated our modularity analysis on the
‘distance-subtracted’ matrix built from these residuals. a, Weighted
connectivity matrix for 43 cortical areas showing the value of the residuals
from a power law to fit the distance component. Rows are sources, columns are
targets. Colours on the rows indicate distance-subtracted community
structure with varying levels of resolution (γ=0.5–1.5 on the y-axis, γ=0.8 only
on the top portion of the x-axis). Columns are coloured by their module
affiliation in the distance-subtracted matrix above their module affiliation in
the original matrix (Fig. 1e). The inset in the top left corner shows the
modularity metric (Q) for each level of γ, along with the Q value for a shuffled
network containing the same weights. The Q values for modularity in the
distance-subtracted matrix were smaller than for the original cortical matrix
(for example, 0.2754 versus 0.4638 at γ=0.8) and the range of values for which
Q was greater than Q_shuffled was narrower (0.7 < γ < 1.7), but some modules were
still present in the distance-subtracted cortical connectivity matrix. The
difference between Q and Q_shuffled was greatest for γ=0.8. The first
distance-subtracted module was comprised of the entire somatomotor module, most of
the lateral module, and two regions from the prefrontal module. The second
distance-subtracted module contained the visual, auditory, and medial
modules, plus most of the prefrontal module and one region from the lateral
module (temporal association area). Notably, these modules were like those
reported by Rubinov et al. As γ increased past 1.0, regions began to split from
the two large modules in small groups that generally did not reflect the original
divisions, except for the auditory areas. b, Ipsilateral cortical network in 2D
using a force-directed layout algorithm. Nodes are colour coded by module.
Edge thickness shows residual values and edges between modules are coloured
as a blend of the module colours. c, Cortical regions colour-coded by their
distance-subtracted community affiliation at γ=0.8 show spatial
relationships. Colour scale: log10 (residual connection density).
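
The distance-subtraction step in this caption can be sketched as a power-law fit followed by taking residuals. The caption does not state how the fit was performed, so the version below is an assumption: it fits weight = a × distance^b by ordinary least squares in log10-log10 space and returns residuals in log10 units. Function names are hypothetical.

```python
import math


def fit_power_law(distances, weights):
    """Least-squares fit of log10(weight) = intercept + slope * log10(distance).

    Assumes all distances and weights are strictly positive.
    Returns (intercept, slope).
    """
    xs = [math.log10(d) for d in distances]
    ys = [math.log10(w) for w in weights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def distance_subtracted_residuals(distances, weights):
    """Residual log10 weight after removing the fitted distance trend."""
    intercept, slope = fit_power_law(distances, weights)
    return [
        math.log10(w) - (intercept + slope * math.log10(d))
        for d, w in zip(distances, weights)
    ]
```

Here `distances` and `weights` would hold one entry per connected pair of areas; the residuals, rearranged into a source-by-target matrix, play the role of the ‘distance-subtracted’ matrix on which the modularity analysis is repeated.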

Extended Data Fig. 4 | Whole-brain single-neuron reconstructions reveal L4
IT projections. a, L4 neurons are classified into at least three morphological
types as shown. b, Image shows sparse labelling of L2/3 and L4 neurons in the
tamoxifen-inducible Cux2-IRES-CreERT2 driver crossed with the Ai166
reporter and using a low dose of tamoxifen via oral gavage for 1 day. L4 neurons
were identified on the basis of their apical dendrite and local axons, using
additional anatomical context when possible. Reconstruction was performed
using Vaa3D-TeraVR on the high-resolution whole-brain image stack
(composed of more than 10,000 images; resolution x × y × z: 0.3 × 0.3 × 1 µm)
acquired with a two-photon fMOST system. c, We identified 25 L4 neurons for
complete morphological reconstruction of dendrites and axons for three cell
types and three cortical areas. In this Cre line at least, spiny stellate cells (SSCs)
were most frequently identified. d, Dorsal surface view shows the CC
projection patterns from three anterograde tracer experiments into the
predominantly L4 Cre lines for somatosensory cortex (SSp-m), visual cortex
(VISp) and auditory cortex (AUD). e–k, Each panel shows two examples of
reconstructed cells of the same L4 type in somatosensory, visual or auditory
cortex. Local morphology for each cell is shown in the inset. Arrowheads
indicate axon clusters outside local region. Red, axon; blue, basal dendrite;
black, apical dendrite. Consistent with canonical descriptions, we found SSCs
in the somatosensory cortex that had only local axon clusters (e). However,
even in these cases, we frequently observed what appeared to be an aborted
axon branch (no terminal cluster found; long arrow). We also found SSCs in
somatosensory cortex that did have clear axon clusters in nearby areas (g), and,
in auditory cortex, SSCs projected even to the opposite hemisphere (f).
h–k, Although we identified fewer tufted pyramidal (TPC) and untufted
pyramidal (UPC) cell types in this experiment, for both types we still found cells
with near and long-range projections.

Extended Data Fig. 5 | Locations and cortical projection patterns from
thalamic tracer experiments. a, Locations of the thalamic tracer injection
centroids (blue dots) mapped onto virtual 2D coronal planes from the Allen
CCFv3. To minimize the number of sections shown, all centroids are mapped
within 200 µm of their original location. See Supplementary Table 1 (thalamus
tab) for more details on Cre lines and coverage. b, Example TC projections are
shown in a flat map view of the ipsilateral cortical hemisphere for different
thalamic nuclei arranged by the clusters identified in Fig. 3 and related to
cortical modules. Most thalamic clusters projected primarily to a single
module (Fig. 3c), but some thalamic regions projected across multiple modules 
(for example, anteroventral nucleus (AV), ventral anterior-lateral complex 
(VAL), parafascicular nucleus (PF), and central lateral nucleus (CL)), or 
projected strongly to both prefrontal and another module; for example, 
somatomotor (mediodorsal nucleus (MD)-1, ventral medial nucleus (VM)), 
lateral (paraventricular nucleus (PVT), MD-2, parataenial nucleus (PT)) or 
medial regions (nucleus of reuniens (RE), anteromedial nucleus (AM)). 

Extended Data Fig. 6 | Comparison of corticothalamic projection strengths
derived from EGFP and SypEGFP tracer experiments. a–d, Maximum
intensity projections from four experiments within 500 µm of each other
targeting VISp (same experiment labelled VISp-3 below) using either EGFP or
SypEGFP tracers in the Rbp4-Cre_KL100 (L5) or Ntsr1-Cre_GN220 (L6) line as
indicated. a′–d′, Coronal STPT images near the centre of the densest terminal
zone in LGd show axon and presynaptic terminal labelling in LGd and other
thalamic targets, including the ventral lateral geniculate (LGd, LGv), the
intergeniculate leaflet (IGL) and the lateral posterior nucleus (LP). The anterior
pretectal nucleus (APN) in the midbrain is also indicated. SypEGFP labelling is
more punctate and has less fluorescence in axons and fibre tracts. a″–d″,
Coronal STPT images near the centre of one of the densest terminal zones in
the middle of LP. a‴–d‴, Coronal STPT images near the centre of the second
densest terminal zone in the anterior part of LP. This image also contains a
portion of the terminal zone in the lateral dorsal nucleus (LD). e–h, Directed,
weighted, connectivity matrices (11 × 44) showing log10-transformed
normalized projection volumes for the Cre lines representing CT projections
labelled from layers 5 (e, f) or 6 (g, h) with EGFP or SypEGFP tracer as indicated.
True negatives (including passing fibres) at the regional level were masked and
coloured dark grey. The colour map is the same as in Fig. 4. The matrix shows
relative differences for connections originating from L5 versus L6
((L5 − L6)/(L5 + L6)) for EGFP-based measures (i) and SypEGFP-based measures (j).
k, Normalized projection strengths for CT targets (n=484) were significantly
correlated from matched cortical locations between EGFP and SypEGFP tracers
for both Cre lines (Spearman r=0.71, 0.73; P < 0.0001). On average, EGFP CT
NPVs were ~0.5 log unit larger than SypEGFP for Rbp4 experiments, but were
not different for the Ntsr1 line. l, Normalized projection strengths for CT
targets (n=484) contacted by L5 or L6 cortical neurons in matched injection
locations were also significantly correlated for both EGFP and SypEGFP tracers
(Spearman r=0.51, 0.60; P < 0.0001), although more weakly than for the same
line between viruses. Specific connections with different fibre to terminal
ratios are coloured by source module (light blue, from VISp; orange, from SSp;
dark blue, from RSPagl). m, Relative differences in projection strength to LP
and LGd are plotted from n=6 VISp injection experiments (VISp-1 to VISp-6 in
matrix rows above) for each Cre line and viral tracer. n, Relative difference
ratios calculated for L5 to L6 using EGFP are plotted against those obtained
using SypEGFP (n=484 CT connections, n=278 above threshold). There is a
significant correlation (Spearman r=0.68, P < 0.0001). Specific connections
are coloured by source module (from l) and labelled with the target.
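
The relative-difference measure in panels i and j, (L5 − L6)/(L5 + L6), can be written as a short helper. This is only an illustrative sketch: the function name is hypothetical, and returning None when both inputs are zero is an assumption about how empty connections might be handled.

```python
def relative_difference(l5_strength, l6_strength):
    """Return (L5 - L6) / (L5 + L6) for non-negative projection strengths.

    +1 means the connection arises only from L5, -1 only from L6, and 0
    means equal strength. Returns None when both strengths are zero
    (an assumed convention for absent connections).
    """
    if l5_strength < 0 or l6_strength < 0:
        raise ValueError("projection strengths must be non-negative")
    total = l5_strength + l6_strength
    if total == 0:
        return None
    return (l5_strength - l6_strength) / total
```

Because the ratio is bounded between −1 and +1, values computed from EGFP and SypEGFP data can be compared on a common scale even when the tracers differ in absolute projection strength.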

Extended Data Fig. 7 | Validation of informatics-processing steps: CCF
registration and quantification from segmentation. a–c, To determine the
precision of the registration process on which we rely here for quantification of
signal by layer in the cortex, we manually delineated layers 1 to 6b, using
background fluorescence in coronal STPT images, for n=9 cortical areas
(ACAd, ORBvl, AId, PERI, SSp-bfd, MOp, VISp, RSPd, and AUDp; see
Supplementary Table 3) in n=4 mice per region. We then quantified the
percentage of voxels within each manually annotated layer that were assigned
to all cortical layers following automated registration to the CCFv3. a, A
confusion matrix shows the mean percentage of overlapping voxel labels
averaged across these areas (individual region data in Supplementary Table 7).
b, c, Boxplots show the median and mean (indicated with +); whiskers show the
minimum-maximum range for the percentage overlap for individual
experiments (b) or cortical areas (c, coloured dots). Across these cortical areas,
the average percentage overlap ranged from 86 to 96% of voxels appropriately
registered for all layers, except for L6b, which was not included in subsequent
layer quantifications. For some areas and layers, the precision was worse than
others; for example, while 66% of voxels were appropriately assigned to L2/3 in
ACAd, the remaining 34% were assigned to neighbouring L5. In ORBvl, only 51%
of voxels were appropriately labelled for L6a. Note, however, that delineating
layer 5 from L6a in ORBvl in coronal sections using just background
fluorescence was very difficult even for experienced anatomists, so some of
the imprecision may in fact come from the manual drawing. Even with these
exceptions noted, in all cases a large majority of voxels were registered and
assigned correctly. d, e, Frequency distributions of informatically derived
quantification for manually verified true negative and positive targets. d, The
numbers of log10-transformed normalized projection values are plotted for all
CC and TC targets manually verified as true negative (n=24,272) or true
positive (n=12,921). Most true positive values were between log10 = −4 and
log10 = 1. At log10 = −1.5 (red arrow), 639 true negatives remained (2.6%), while
7,100 true positives were still included (54.9%), resulting in a false positive rate
of 8.3% at this threshold level. e, Numbers of log10-transformed normalized
projection values plotted for all CC and TC targets manually verified as true
negative (n=15,789) or true positive (n=4,503). At log10 = −2.5 (red arrow), 362
true negatives remained (2.3%), while 3,335 true positives were still included
(74.1%), resulting in a false positive rate of 9.8% at this threshold level.
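
The percentages quoted in panels d and e can be reproduced from the counts in the caption. The sketch below assumes that the "false positive rate" is the share of above-threshold values that are manually verified true negatives, which is the reading that matches the quoted 8.3% and 9.8%. The function name and dictionary keys are hypothetical.

```python
def threshold_summary(tn_total, tp_total, tn_remaining, tp_remaining):
    """Percentages for one threshold, using counts from the caption.

    false_positive_pct is computed as remaining true negatives divided by
    all values remaining above threshold (assumed definition).
    """
    return {
        "tn_remaining_pct": 100.0 * tn_remaining / tn_total,
        "tp_retained_pct": 100.0 * tp_remaining / tp_total,
        "false_positive_pct": 100.0 * tn_remaining / (tn_remaining + tp_remaining),
    }


# Counts from panel d (threshold log10 = -1.5) and panel e (log10 = -2.5).
panel_d = threshold_summary(24272, 12921, 639, 7100)
panel_e = threshold_summary(15789, 4503, 362, 3335)
```

Rounded to one decimal place, `panel_d` gives 2.6, 54.9 and 8.3, and `panel_e` gives 2.3, 74.1 and 9.8, matching the values in the caption.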

Extended Data Fig. 8 | CC projection patterns by layer and class between
reciprocally connected areas with known hierarchy. a, In the visual module,
VISp and VISal (see Supplementary Table 3) are reciprocally connected (black
line). VISp is the de facto bottom of visual cortex hierarchy. The output to VISal
from VISp is feedforward (FF). The reciprocal connection (VISal to VISp) is
feedback (FB). In the FF direction (top), projections from VISp L2/3, L4, and L5
IT cells were densest in L2/3–L5 of VISal, and relatively sparse in L1 and
L6 (cluster 4). Rbp4 projections from VISp to VISal were densest in L4 and L6,
with moderate levels in L2/3 (cluster 8). L5 PT and L6 CT cells projected, albeit
sparsely, to L1 and L5 (cluster 2). In the FB direction (bottom), L2/3 IT axons
were broadly distributed across layers, with a sparser region in L5 (cluster 6).
VISal L4 IT cells projected noticeably more weakly to VISp (as opposed to the
panel above), and terminated with a different pattern (L1 and L5/6, cluster 6). L5
IT cells projected densely to superficial layers in VISp (cluster 1). Rbp4 axons
were dense in L1 and deep layers (cluster 6). Projections from L5 PT and L6 CT
cells were also sparse, but present in L1 and L6 (cluster 6). b, In the
somatomotor module, SSp-bfd and SSs cortex are reciprocally connected.
SSp-bfd to SSs is FF; the reverse is FB. In the FF direction (top), L2/3 and L4 IT cells
preferentially innervate L2/3–L5, with relatively fewer terminals in L1 and L6
(clusters 3 and 4). L5 IT projections densely innervate L1 and L2/3 (cluster 1).
Rbp4 projections were densest in L4 and L6, with moderate levels in L2/3
(cluster 8). L5 PT and L6 CT cell projections were sparse, and to L1 and/or deep
layers (clusters 2 and 6). In the FB direction (bottom), the patterns looked
remarkably like FB projections from VISal to VISp. Note again the strong
connection originating from L4 cells only in the FF direction. c, VISp (in the
visual module) and ACAd (in the prefrontal module) are reciprocally
connected. ACAd exerts top-down control of VISp activity (FB); the reverse
(VISp to ACAd) is considered FF. In the FF direction (top), L2/3, L4, and L5 cells
all preferentially innervate L1 (cluster 1). In the FB direction (bottom), L2/3 cells
also predominantly terminate in L1, but L5 cells project to both L1 and deep
layers (L5 and L6, cluster 6). Note also there is a potentially significant sub-layer
distinction; axons from VISp to ACAd are relatively deeper in L1 (or at the
border of L1 and L2/3) of ACAd, compared to the more superficial termination
of ACAd axons in L1 of VISp. All panels: overall, FF projections are more often in
clusters 1, 4, and 8, and FB projections in cluster 6. Cluster assignments are
indicated in each panel; n/a indicates that the connection was either absent or
below threshold for clustering. Areas in each module are shown in a top-down
cortex view and the network as a force-directed layout
(edges denote normalized connection density from Fig. 1e). STPT images in the
approximate centre of each target region show the laminar distribution of
axons arising from labelled neurons in the different Cre lines. Images are
rotated so that the pial surface is always at the top of each panel.


[Extended Data Fig. 9 image. Recoverable panel titles: LGd to VISp (FF), VISp to LGd (FB), VPL to SSp-ll (FF), ILA to PT (FB), MD to AId (FF), AId to MD (FB), LP to VISam (FF), PO to MOs (FB), RE to VISp, RE to ORBl (FB), VM to SSp-ll (FB), VM to ORBl (FB), ORBl to VM (FB), VM to MOs (FB), MOs to VM (FB).]

Extended Data Fig. 9 | TC and CT projection patterns and rules between
reciprocally connected areas. a, Schematic summarizes observed projection
patterns between core thalamic nuclei (blue circle) and their reciprocally
connected cortical targets (L1–L6 colour coded). Laminar patterns are from
Fig. 5g. STPT images of labelled axon terminals between three pairs of core
nuclei and primary sensory cortex that perfectly follow rules in both
directions. In the FF direction (LGd to VISp, VPL to SSp-ll, VPM to SSp-n),
projections are dense in L4 or L4 and L6 (clusters 4, 8). In the FB direction, CT
projections predominantly arise from L6. b, Schematic summarizes observed
projection patterns between matrix-focal thalamic nuclei (orange circle) and
their reciprocally connected cortical targets. STPT images of reciprocal
connections between PT and ILA, MD and ORBl, and MD and AId illustrate the
schematized rules. Projections from these thalamic nuclei belong to clusters
with relatively fewer L1 axons (FF-like, clusters 3, 7, 9). The reciprocal CT input is
also stronger from L6 (FB), like the core nuclei above. c, Three schematics are
shown to summarize observed projection patterns between matrix-multiareal
thalamic nuclei (red circles) and their reciprocally connected cortical targets.
The top schematic shows dense TC projections to L1 (FB) with CT projections
originating from L5 (FF). The middle schematic (with relevant example images
boxed) shows reciprocal connection patterns in which TC projections target
mid-layers (FF-like) and the reciprocal CT input is stronger from L6 (FB). The
bottom schematic shows the same TC projection pattern as the top schematic,
but with CT projections originating approximately equally from L5 and L6.
STPT images show reciprocal connections between multiarea-matrix thalamic
regions LP, PO, RE, and VM to three cortical targets each. Some regions have
target-specific projections that are either FF or FB. For example, different from
the LP-to-VISp projection (FB), axons from LP to VISam and ACAd target
mid-layers as opposed to L1 (clusters 8 and 5, FF), and the reciprocal connection
arises more from L6 (typical for FB). Projections from PO, RE, and VM to all
three cortical targets are consistent with an FB projection (denser terminations
in L1 and either L5 or L6; clusters 2 and 6). Reciprocal CT projections originate
from L5 or both L5 and L6. We did not see CT input arising equally from both
layers or more from L5 when the reciprocal TC projection was considered FF,
consistent with the ‘no-strong-loops’ hypothesis. All panels: overall, FF
projections from core thalamic regions are in clusters 4 and 8. FB projections
from matrix-multiareal thalamic regions are in clusters 2 and 6, like CC FB. The
matrix-focal results support the notion that patterns with relatively less L1
involvement (3, 5, 7, 9) are FF, particularly given the strong reciprocal input
observed from L6. STPT images are from the approximate centre of the axon
termination field for each target region. Cortex images were rotated so that the
pial surface is at the top. Cluster assignments (for TC) are indicated in each
panel. Text labels above image show FF and FB direction based on relative
position in Fig. 6. Dashed lines indicate region borders.

Extended Data Fig. 10 | Robustness of the hierarchical organization results.
We constructed multiple hierarchies using only C57BL/6J and Emx1-IRES-Cre
experiments (WT) or Cre data without the Cre line confidence measure to
compare with results in Fig. 6. The hierarchical position of each area, H_i, and the
CC global hierarchy score are defined as in Eqs. (4, 5) in Methods, but with
the same confidence for all lines, that is, conf(T) = 1 for all Cre lines (T). a, b, In
both cases, connection types 2 and 6 are assigned to one direction (feedback),
while other clusters are grouped to the opposite direction (feedforward).
Cluster 7 was not identified in the WT dataset. c, CT connections were also
classified as in Fig. 6b for the Cre data. CT connections were not included for
WT as these are exclusively defined by Cre lines. d, e, Global hierarchy scores
from the original, observed data, and the distributions of hierarchy scores
obtained from shuffled data sets (n=100) are shown for CC connections only
(green), compared to scores obtained when TC and CT connections are
sequentially included (pink, blue). The upper bound scores for an artificially
perfect hierarchy using the WT datasets (e) are 0.630 for CC and 0.601 for
CC+TC connections. f, z-scores were calculated for the global hierarchy scores
compared to shuffled data for each of the three versions of cortical hierarchy
(CC, CC+TC, CC+TC+CT). The highest z-scores were observed when using Cre
line confidence weighting (compared to those with no confidence weighting or
wild type data only). g, Predicted hierarchical positions of 37 cortical and 24
thalamic areas based on CC, CC+TC, or CC+TC+CT connections. Areas are
ordered in each panel by the scores obtained using Cre line data with
confidence weighting (Cre conf, black circles). Scores from Cre line data
without confidence weighting (grey circles) and scores from wild type/Emx1-IRES-Cre
data (open circles) are plotted for direct comparison. y-axis labels are
colour coded by module assignment (for cortical areas). h, Robustness of the
cortical hierarchy (w/ Cre conf) against individual Cre lines and projection
classes. Left, Spearman rank correlation coefficients between the CC and
CC+TC hierarchy with n=13 layer- or class-specific Cre lines included versus
each of the Cre lines removed. Right, results when data from Cre lines with the
same layer and class were removed together. Removal of these lines and classes
produced relatively minor deviations from the overall hierarchy determined
with all data. Note that in both panels the y-axis starts at r=0.85. For all lines
and classes, the correlation with the hierarchy using the complete data set is
very high. The lowest correlations occurred following removal of Cux2-IRES-Cre,
Rbp4-Cre_KL100, and Tlx3-Cre_PL56.
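
The z-scores in panel f compare an observed global hierarchy score with the distribution obtained from shuffled data. A minimal sketch follows; it assumes the usual definition (observed minus shuffled mean, divided by the shuffled standard deviation) and uses the sample standard deviation, neither of which is specified in the caption. The function name is hypothetical.

```python
from statistics import mean, stdev


def hierarchy_z_score(observed_score, shuffled_scores):
    """z-score of an observed score against scores from shuffled data.

    Assumes z = (observed - mean(shuffled)) / sample s.d.(shuffled).
    """
    if len(shuffled_scores) < 2:
        raise ValueError("need at least two shuffled scores")
    spread = stdev(shuffled_scores)
    if spread == 0:
        raise ValueError("shuffled scores have zero spread")
    return (observed_score - mean(shuffled_scores)) / spread
```

With n=100 shuffled data sets, as in panels d and e, `shuffled_scores` would hold 100 values per hierarchy version, and one z-score would be computed for each of CC, CC+TC and CC+TC+CT.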

Data collection: Serial two-photon images were processed using the Allen informatics data pipeline (IDP), which manages the processing and organization of
the images and quantified data for analysis and display in the web application as previously described (Oh et al., 2014 and Kuan et al.,
2015). Illumination and image acquisition for intrinsic signal imaging were controlled with in-house GUI software written in Python.
For single cell morphologies, following acquisition of the complete fMOST image stack, it was converted to a multi-level navigable dataset
using the open source Vaa3D-TeraFly program; reconstructions were then performed using Vaa3D-TeraVR software tools built to facilitate
semi-automated and manual reconstructions (Bria et al., 2016 and Wang et al., 2019).


Data analysis Unsupervised hierarchical clustering, and visualization of the dendrogram and heat maps, were performed with the online software Morpheus
(https://software.broadinstitute.org/morpheus/). GraphPad Prism was used for
statistical tests and generation of all graphs, and Gephi was used for visualization and layout of network diagrams.
Vaa3D was used for visualization of single-cell morphologies. Hierarchy analyses were performed as described in
detail in Methods. Code and data files for hierarchical analyses are available through the Allen SDK and GitHub
(https://github.com/AllenInstitute/MouseBrainHierarchy).


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


Data (including high resolution images, segmentation, registration to CCFv3, and automated quantification of injection size, location, and distribution across brain 
structures) are available through the Allen Mouse Brain Connectivity Atlas portal (http://connectivity.brain-map.org/). Individual experiment summaries can be 
viewed using this link: http://connectivity.brain-map.org/projection/experiment/[insert experimental id]. In addition to visualization and search tools available at this
site, users can download data using the Allen Brain Atlas API (http://help.brain-map.org/display/mouseconnectivity/API) and the Allen Brain Atlas Software 
Development Kit (SDK: http://alleninstitute.github.io/AllenSDK/connectivity.html). Through the SDK, structure and voxel-level projection data are available for 
download. Examples of code for common data requests are provided as part of the Mouse Connectivity Jupyter notebook to help users get started with their 
analyses. Our code for hierarchical analyses is also available through the Allen SDK and Github (https://github.com/AllenInstitute/MouseBrainHierarchy). 
Sample size In our previously published study (Oh et al., 2014), we demonstrated that n = 1 is a good predictor of connectivity strengths across multiple
animals. Here, we also found that the correlations between brain-wide projection strengths from experiments at matched locations are
positive and significant (r > 0.8, Extended Data Figure 1). Thus, we are able to confidently and comprehensively sample across the entire cortex
(and entire brain) with n = 1 experiment per source area and Cre line. This consistency is what allows us to map the connectome, which still
required >1,000 experiments (i.e. mice) for the coverage we determined was necessary for close to completeness.


Data exclusions Image series for each tracer experiment were curated for inclusion based on pre-established QC metrics, and quantitative data had to pass
the threshold criteria described in the Results. 


Replication As stated above, in our previously published study (Oh et al., 2014), we demonstrated that an n=1 is a good predictor of connectivity 
strengths across multiple animals. Here, we also found that the correlations between brain-wide projection strengths from experiments at 
matched locations are positive and significant (r>0.8, Extended Data Figure 1). Thus, we are able to confidently and comprehensively sample 
across the entire cortex (and entire brain) with n=1 experiment per source area and Cre line. This consistency is what allows us to map the 
connectome, which still required >1,000 experiments (i.e. mice) for the coverage we determined was necessary for close to completeness. 


Randomization Randomization of animals to different groups is not relevant to our study design. We did not have experimental vs. control groups. 


Blinding Image data acquisition and quantitative measures of projection strengths are automated, so blinding was not necessary. Investigators
performing manual target analyses were not blinded to injection source or Cre line; blinding was impossible because the annotation required an
anatomy expert to look through every image, so the injection location, and hence the layer of origin of the cells, would have been evident.
Obtaining unique materials Viral tracers are available through Addgene, the Penn Vector Core, or via request to the Allen Institute. Cre driver lines are
available from the repositories indicated in Supplementary Table 1. 


Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals Mus musculus, C57BL/6J, males and females, Cre driver transgenics, P56 (±7 days)
Wild animals This study did not involve wild animals. 
Field-collected samples This study did not involve field-collected samples. 
Accumulation of mutant proteins is a major cause of many diseases (collectively
called proteopathies), and lowering the level of these proteins can be useful for
treatment of these diseases. We hypothesized that compounds that interact with both
the autophagosome protein microtubule-associated protein 1A/1B light chain 3 (LC3)
and the disease-causing protein may target the latter for autophagic clearance.
Mutant huntingtin protein (mHTT) contains an expanded polyglutamine (polyQ)
tract and causes Huntington’s disease, an incurable neurodegenerative disorder.
Here, using small-molecule-microarray-based screening, we identified four
compounds that interact with both LC3 and mHTT, but not with the wild-type HTT
protein. Some of these compounds targeted mHTT to autophagosomes, reduced
mHTT levels in an allele-selective manner, and rescued disease-relevant phenotypes
in cells and in vivo in fly and mouse models of Huntington’s disease. We further show
that these compounds interact with the expanded polyQ stretch and could lower the
level of mutant ataxin-3 (ATXN3), another disease-causing protein with an expanded
polyQ tract. This study presents candidate compounds for lowering mHTT and
potentially other disease-causing proteins with polyQ expansions, demonstrating
the concept of lowering levels of disease-causing proteins using
autophagosome-tethering compounds.


Lowering the levels of disease-causing proteins, especially those with
unknown activities, is an emerging approach for disease treatment.
Biological tools such as RNA-mediated inhibition (RNAi) or CRISPR may
achieve this goal, but their clinical delivery is challenging. Enhancing
proteasomal degradation of target proteins using proteolysis-targeting
chimeric molecules (PROTACs) is a promising emerging approach, but
proteasomes alone are inefficient in degrading certain large proteins or
aggregates. Macroautophagy (hereafter referred to as autophagy), an
independent protein-degradation pathway, is a bulk degradation system
that engulfs proteins into autophagosomes for subsequent lysosomal
degradation. Autophagy is present in all eukaryotic cells, and therefore
harnessing the power of autophagy to degrade certain target proteins
may have potential for drug discovery. Here we investigate this possibility
in the context of lowering mHTT, which contains a polyQ stretch with at
least 36 glutamine residues and causes Huntington’s disease, an incurable
monogenetic neurodegenerative disorder.

mHTT could be degraded by autophagy, during which protein substrates
are incorporated into double-membrane autophagosomes
associated with lipidated LC3. We therefore hypothesized that linker
compounds that interact with both mHTT and LC3 may tether the molecules
together to enhance recruitment of mHTT into autophagosomes,
facilitating its degradation. In addition, mHTT-LC3 linker compounds
that do not interact with wild-type HTT (wtHTT) may promote
allele-selective degradation of mHTT. Because no mHTT-LC3-interacting
compounds have been reported, we performed small-molecule-microarray
(SMM)-based screening for such compounds and used wtHTT for
the counter-screen to identify allele-selective candidates.


Results 

Identification of mHTT-LC3 linker compounds

We stamped 3,375 compounds (Fig. 1a) in duplicate onto a microarray
on isocyanate-functionalized glass slides using the nucleophile-isocyanate
reaction, which forms covalent bonds between the compounds
and the glass slides. We then purified the human LC3B protein
(Extended Data Fig. 1a, b, Supplementary Table 1), a pathogenic mHTT
exon1 fragment with an expanded polyQ region containing 72 glutamines
(mHTTexon1(Q72)), and a control wtHTT exon1 fragment
Peninsula Schools of Medicine and Dentistry, Institute of Translational and Stratified Medicine, University of Plymouth, Plymouth, UK. ‘Laboratory of Cellular Neurobiology, Department of
Fig. 1 | Identification of potential mHTT-LC3 linker compounds by SMM-based
screening and validation. a, OI-RD image of an SMM. Each compound
was printed in duplicate in adjacent vertical positions. b, Magnified view of
surface mass-density changes after incubation with HTTexon1(Q25)-MBP,
HTTexon1(Q72)-MBP or LC3B. The red outlines highlight two hits (1005 and
8F20). c-e, Association-dissociation curves of surface-immobilized
compounds 8F20 and 1005 with HTTexon1(Q72)-MBP (Q72) (c), LC3B (d) and


(HTTexon1(Q25)) (Extended Data Fig. 1c, d) for the screen. We fused
a maltose-binding protein (MBP) tag to both HTT exon1 proteins to
increase their solubility for subsequent experiments.

To identify compounds that interact with LC3B and mHTT, we
incubated these proteins over the SMMs and detected compound-protein
interactions using a scanning oblique-incidence reflectivity
difference (OI-RD) microscope, an optical biosensor. We then performed
experiments with HTTexon1(Q25) or buffer alone to exclude
nonspecific signals, and identified two compounds, 1005 (GW5074,
3-3-((3,5-dibromo-4-hydroxyphenyl)methylidene)-5-iodo-1H-indol-2-one)
and 8F20 (ispinesib,
N-(3-aminopropyl)-N-((1R)-1-(7-chloro-4-oxo-3-(phenylmethyl)-2-quinazolinyl)-2-methylpropyl)-4-methylbenzamide),
that interact with both LC3B and mHTTexon1(Q72), but
not with HTTexon1(Q25) (Fig. 1b, annotation based on the ID in the
compound library). We then measured the on and off rates (kon and koff,
respectively) of these interactions to confirm our observation (Fig. 1c,
d), finding that both compounds showed dissociation constants (Kd) of
around 100 nM with LC3B or mHTTexon1(Q72). As shown in Fig. 1e, these
compounds also interacted with the full-length mHTT (flHTT(Q73),
Extended Data Fig. 1e), but not with wild-type HTT (HTTexon1(Q25)
full-length HTT(Q73) (Q73) (e) at the indicated purified protein concentrations.
In association-dissociation curves, vertical dashed lines mark the starts of
association and dissociation phases of the binding event. The red dashed
curves are global fits to a Langmuir reaction model with the fitting parameters
listed at the bottom of each plot. No binding signals were observed for
HTTexon1(Q25)-MBP or full-length HTT(Q23) proteins, and thus these
parameters are not presented.


or flHTT(Q23), Fig. 1c, e) or irrelevant proteins (Extended Data Fig. 2a)
including MBP-His, (MBP), superfolder GFP (sfGFP) and Rpn10 (a
proteasomal ubiquitin receptor) (Extended Data Fig. 1f). We then validated
the interaction using an orthogonal assay, microscale thermophoresis
(MST), and obtained consistent results (Extended Data Fig. 2b).
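
The relationship between the measured rates and the dissociation constant can be made concrete with a short sketch of a 1:1 Langmuir binding model, the reaction model named in the Fig. 1 legend. The rate constants below are hypothetical values chosen only so that the resulting dissociation constant is about 100 nM; they are not the fitted parameters from the figure, and the function names are illustrative.

```python
import math


def kd_from_rates(k_on, k_off):
    """Dissociation constant (M) from k_on (M^-1 s^-1) and k_off (s^-1)."""
    return k_off / k_on


def langmuir_response(t, conc, k_on, k_off, r_max, t_off):
    """Signal of a 1:1 Langmuir binding event at time t (seconds).

    Protein at concentration `conc` (M) flows over the immobilized compound
    from t = 0 until t = t_off (association phase); afterwards buffer alone
    flows and the bound protein dissociates with rate k_off.
    """
    k_obs = k_on * conc + k_off                      # observed association rate
    r_eq = r_max * conc / (conc + k_off / k_on)      # equilibrium signal
    if t <= t_off:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    r_at_off = r_eq * (1.0 - math.exp(-k_obs * t_off))
    return r_at_off * math.exp(-k_off * (t - t_off))


# Hypothetical rates giving Kd = 1e-7 M (100 nM).
K_ON, K_OFF = 1.0e4, 1.0e-3
kd = kd_from_rates(K_ON, K_OFF)

# Association for 600 s at 200 nM protein, then dissociation.
curve = [langmuir_response(t, 200e-9, K_ON, K_OFF, 1.0, 600.0)
         for t in range(0, 1201, 100)]
```

In a global fit, one pair of rate constants is shared across the curves recorded at all protein concentrations, which is why a single Kd per compound-protein pair can be reported.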


Linkers induced allele-selective mHTT lowering 


We then tested whether these potential mHTT-LC3 linker compounds
decrease mHTT levels via autophagy as predicted. Both compounds
decreased levels of mHTT in cultured primary cortical neurons from
a well-established HD-knock-in mouse model (HdhQ7/Q140) (Fig. 2a),
but had little or no effect on levels of wtHTT in the heterozygous HD
neurons (HdhQ7/Q140) (Fig. 2a) or wild-type neurons (HdhQ7/Q7) (Fig. 2b),
consistent with the lack of interaction of these compounds with wtHTT.
We then screened for other mHTT-LC3 linker compounds on the basis
of common features of the two hit compounds 1005 and 8F20. The
hydroxyl group in 1005 and the amino group in 8F20 were used in the
nucleophile-isocyanate reaction for stamping of the SMMs, and these
groups were inaccessible to mHTT and LC3B for the compound-protein
Fig. 2 | mHTT-LC3 linker compounds lower mHTT but not wtHTT in cultured
mouse neurons via autophagy. a, Western blot (HTT detected by the 2166
antibody) and quantification of compound-treated cultured cortical neurons
from HdhQ7/Q140 HD-knock-in mice. Two-way ANOVA. For 1005, F(1, 72) = 50.93,
P < 0.0001; for 8F20, F(1, 40) = 8.903, P = 0.0048. b, Representative western
blots (from three biological repeats) of cultured wild-type cortical neurons
treated with the indicated compounds. c, Two-dimensional structures of the
hit compounds and the other identified effective linker compounds. The red


interaction during the screening (Fig. 2c). Thus, whereas the two hit
compounds have different structures, the exposed chemical groups on
the SMMs share similarities in that they contain an aryl ring connected
to a lactam-based bicyclic structure with a halogen-substituted aryl
group (Fig. 2c). We tested several compounds with similar features and
identified two additional mHTT-LC3 linker compounds (Fig. 2c), AN1
(3-5-bromo-3-((3-bromo-4,5-dihydroxyphenyl)methylidene)-1H-indol-2-one)
and AN2 (5,7-dihydroxy-4-phenylcoumarin), which interact with
both mHTT and LC3B but not with wtHTT or irrelevant control proteins
(Extended Data Fig. 2c, d). They also reduced the levels of mHTT in
an allele-selective manner in cultured HD mouse neurons (Fig. 2d).
No cytotoxicity was observed in cultured neurons treated with these
compounds at the tested concentration range (Extended Data Fig. 2e),
confirming that the reduction in mHTT was not due to cell loss.

lines indicate the glass chip surface on which the compounds are immobilized.
The dotted ovals indicate the possible chemical groups exposed for
protein-compound interactions in the screening. d, Left and middle, as in a, but using
AN1 or AN2. For AN1, F(1, 70) = 32.96, P < 0.0001; for AN2, F(1, 69) = 23.03,
P < 0.0001. Right, as in a, but blotted with the indicated HTT antibodies (1005 and
8F20: 100 nM; AN2: 50 nM). For all panels, n indicates the number of
independently plated wells; data are mean ± s.e.m. Full blots of cropped gels are
shown in Extended Data Fig. 3b or Supplementary Fig. 1.

Most of these compounds showed an optimal dose (hook effect) in
lowering mHTT (Fig. 2a, d): a sufficient concentration is needed for
tethering mHTT and LC3 together, but excessively high concentrations
may cause the compound molecules to interact with mHTT and LC3
separately without tethering them. Similar concentration-dependent
effects were observed in fibroblasts of patients with HD (Fig. 3c, right)
and have been reported for PROTACs. Consistent with the prediction
that the reduction in mHTT is mediated by degradation via autophagy,
the autophagy inhibitor NH4Cl or chloroquine blocked the mHTT-lowering
effects (Extended Data Fig. 3a), confirming that the compounds
targeted mHTT for autophagic degradation. Further, the compound-induced
mHTT-lowering effects were only slightly enhanced by the
mTOR inhibitor rapamycin, an enhancer of autophagosome formation
(Extended Data Fig. 3a, right; also see Fig. 3b).
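
The hook effect follows from simple equilibrium binding, as the sketch below illustrates. It is a toy model under stated assumptions, not an analysis from this study: the target is present at trace levels, binding is non-cooperative, `linker` denotes the free linker concentration, and all affinities and concentrations (in nM) are hypothetical.

```python
def ternary_fraction(linker, target_kd, lc3_kd, lc3_total):
    """Fraction of a trace-level target held in target-linker-LC3 complexes.

    All quantities share one concentration unit (nM here).
    """
    # LC3 molecules already occupied by linker alone cannot bridge to the target.
    lc3_free = lc3_total / (1.0 + linker / lc3_kd)
    # Amounts relative to free target: target-linker and target-linker-LC3.
    binary = linker / target_kd
    ternary = binary * lc3_free / lc3_kd
    return ternary / (1.0 + binary + ternary)


doses_nm = [1, 10, 100, 1_000, 10_000, 100_000]
fractions = [ternary_fraction(d, target_kd=100.0, lc3_kd=100.0, lc3_total=1_000.0)
             for d in doses_nm]
best_dose = doses_nm[fractions.index(max(fractions))]
```

With these inputs the tethered fraction rises, peaks at an intermediate dose and falls again at higher doses, because excess linker saturates the target and LC3 separately. The position and height of the peak depend entirely on the assumed affinities.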

The reduction of mHTT levels could be detected by multiple
mHTT antibodies, including 3B5H10, which detects a toxic species
of the expanded polyQ stretch (Fig. 2d, right), suggesting that
the detected reduction of the mHTT signal was not due to changes
in affinity to a specific antibody. In addition, we did not observe any
obvious increase of possible polyQ-containing mHTT fragments at
lower molecular weights (Extended Data Fig. 3b, c), suggesting that
the reduced levels of mHTT were not a result of increased site-specific
cleavage of mHTT.

We further investigated the effects of the compounds in cells from
patients with HD using the well-established homologous time-resolved
fluorescence (HTRF) assay, which is more quantitative than western
blots but is not applicable to mouse mHTT proteins owing to
non-specific signals. We observed autophagy-dependent lowering
of mHTT by these compounds in fibroblasts from patients with HD
and neurons derived from induced pluripotent stem cells (iPS cells)
(Fig. 3a, b, Extended Data Fig. 3d), but no lowering of wtHTT in
fibroblasts from healthy human donors or patients with Parkinson’s disease
Fig. 3 | mHTT-LC3 linker compounds lower mHTT in cells from patients with
HD. a, HTT levels measured by HTRF (2B7/MW1 for mHTT, and 2B7/2166 for
total HTT) in primary fibroblasts from patients with HD, wild-type controls
(WT) or patients with Parkinson’s disease (PD) who were treated with the
indicated compounds (100 nM). All signals were normalized to the average
signals from the DMSO control group. One-way ANOVA with post hoc Dunnett’s
tests. ****P < 0.0001. b, As in a, but using immortalized fibroblasts treated with
or without the autophagy inhibitors NH4Cl, chloroquine or bafA1, or the


(Fig. 3a). To further confirm the role of autophagic degradation, we
tested the effects of the compounds with or without lowering of ATG5,
a key autophagy gene that is required for autophagosome formation.
ATG5 knockdown in fibroblasts from a patient with HD (expressing
mHTT(Q47)) significantly decreased LC3-II levels and nullified the
mHTT-lowering effects induced by the mHTT-LC3 linker compounds
(Extended Data Fig. 3e). Similar results were obtained in ATG5-knockout
mouse embryonic fibroblasts (MEFs, Extended Data Fig. 3f), confirming
that the effects of the compounds were mediated by autophagic
degradation.

The two hit compounds 1005 and 8F20 are known to inhibit c-Raf and
KSP, respectively, whereas AN1 and AN2 had unknown activities on
these targets. We therefore tested their potential influence on c-Raf and
KSP. In the in vitro c-Raf kinase assay, 1005 (a known c-Raf
inhibitor), but not the other three compounds, inhibited c-Raf at the
concentrations tested (Extended Data Fig. 4a). We then tested MEK and
ERK phosphorylation levels in the cultured neurons treated with these
compounds at optimal mHTT-lowering concentrations to evaluate
Raf activity and found no significant effects of any of the tested compounds
(Extended Data Fig. 4b, left). We also tested phospho-BUBR1 levels
to evaluate KSP activity, and again observed no significant effects
(Extended Data Fig. 4b, right). We made similar observations in
fibroblasts (expressing mHTT(Q47)) from a patient with HD (Extended Data
Fig. 4c). Thus, the observed reduction in mHTT is probably unrelated
autophagy enhancer rapamycin. c, Left, as in a, but using immortalized
fibroblasts from patients with HD (expressing mHTT(Q47)), treated with
the indicated c-Raf or KSP inhibitors at 100 nM. Middle, 2D structures of the
inhibitors. The dotted ovals indicate the parts of the compounds that share
similarities with the hit compounds. Right, dose-response curves of the
indicated compounds. For all panels, n indicates the number of independently
plated wells; data are mean ± s.e.m.


to c-Raf or KSP inhibition. To further confirm this, we examined the
effects of several known c-Raf or KSP inhibitors, and found that they
had no HTT-lowering effects (Fig. 3c, left). Two of these inhibitors,
PLX-4720 and BAY12173839, have structures similar to 1005 and 8F20,
respectively (Fig. 3c, middle). These compounds did not lower mHTT
in cells from patients at sub-micromolar concentrations (Fig. 3c, right),
probably because they had very weak affinity, if any, for LC3 and mHTT
(Extended Data Fig. 2c, right). By contrast, AN2 reduced mHTT levels
in the same cells in a dose-dependent manner (Fig. 3c, right).

We then investigated the effects of the compounds in vivo. Because 
the Drosophila LC3 homologue Atg8 has a predicted structure that is 
highly similar to LC3B (Extended Data Fig. 5a), we tested the compounds 
in an HD transgenic fly model expressing human full-length mHTT. All
of the mHTT-LC3 linker compounds that we identified significantly 
reduced mHTT levels in Drosophila (Extended Data Fig. 5b), validating 
the in vivo efficacy of these compounds. 

We further investigated the in vivo effects of the compounds using
the HD-knock-in mouse model (HdhQ7/Q140) by intracerebroventricular
injections. Treatment with three of the four linker compounds (1005,
AN1 or AN2, but not 8F20) led to significant lowering of mHTT in cortices
of HD mice (Extended Data Fig. 6a). We then performed intraperitoneal
injection of 1005 and AN2 at 0.5 mg kg−1 in HD-knock-in mice.
The compounds crossed the blood-brain barrier and reached the brain
at detectable concentrations (Extended Data Fig. 5c, approximately
Fig. 4 | Linker compounds enhance the mHTT-LC3 interaction and tether
mHTT to autophagosomes. a, b, Representative results (from three biological
repeats) of in vitro pull-down experiments using purified HTT and LC3B
proteins (see Methods for details). c, d, Representative images (scale bar,


20-200 nM for 1005 and 20-40 nM for AN2; no signal was detected in
the DMSO-injected control group) 0.5-6 h after injection. Consistent
with these results, we observed significant allele-selective lowering
of mHTT in mouse cortices and striata (Extended Data Fig. 6b, c). The
observed lowering was not due to changes in mHTT solubility, because
no increase of mHTT aggregates was observed in the cortical tissues of
mice treated with these compounds (Extended Data Fig. 6d).


Linkers tether mHTT to autophagosomes 


We then examined whether these compounds actually function as
linkers between mHTT and LC3 to target mHTT for autophagosome
engulfment. The presence of 1005 or AN2, the two compounds that
were effective by intraperitoneal injection in vivo, markedly enhanced
the mHTT-LC3 interaction in in vitro pull-down experiments (Fig. 4a,
comparing lane 11 with 12 and 13; Fig. 4b, comparing lane 11 with 12,
top and bottom). There was no enhancement effect for the wild-type
HTT-LC3 interaction (Fig. 4a, lanes 8-10; Fig. 4b, lanes 9 and 10, top
and bottom). Consistent with this, these compounds led to increased
engulfment of mHTT by autophagosomes, both in transiently transfected
HeLa cells expressing exogenous GFP-LC3B and
HTTexon1-MBP-His fragments (Fig. 4c), and in mouse striatal cells (STHdh22/2)>1
expressing endogenous LC3 and full-length mHTT proteins (Fig. 4d).

These data confirmed that the compounds tether mHTT, LC3B and 
autophagosomes in vitro and in cells, although the detailed structural 
information remains to be resolved. 


Linkers do not influence autophagy function 


The lowering of mHTT levels by the linker compounds was unlikely to
be a result of enhanced autophagy, because the number and size of
autophagosomes remained unchanged (Extended Data Fig. 7a). We further
investigated whether the compounds could influence autophagy
using established approaches. Neither 1005 nor AN2 influenced
autophagosome-lysosome fusion or autophagy activity (Extended


[Fig. 4d panel labels: striatum (mHTT, LC3-II); DMSO, 1005, AN2, HTT siRNA; F(2, 90) = 34.57]


10 μm) and quantification of the co-localization between HTT and
autophagosomes in HeLa cells transfected with the indicated cDNA plasmids
(c) or striatum from Hdhe!”“ mice (d). Data are mean ± s.e.m.; n indicates the
number of cells. One-way ANOVA with post hoc Dunnett’s test. ****P < 0.0001.


Data Fig. 7b-d). Furthermore, we observed no changes in LC3-II levels
in the cultured cortical neurons treated with 1005 or AN2 in the
absence or presence of the lysosome inhibitor bafilomycin A1 (bafA1)
(Extended Data Fig. 7e). The level of the known autophagy-selective
substrate protein SQSTM1 (also known as p62) was also unaffected
in vivo and in cultured neurons (Extended Data Figs. 7f, 8a). In addition,
other wild-type polyQ proteins (ATXN3 and TBP) and control proteins
(NBR1, NCOA4, actin, GAPDH and tubulin) were not influenced (that
is, any change amounted to less than 10%) (Extended Data Fig. 8a).
We then performed proteomics analysis to obtain a more complete
overview of proteins that may have been influenced by these compounds.
We observed significant lowering (about 20%, P < 0.01) of
HTT levels in cortices of mice injected intraperitoneally with 1005 or
AN2 (Extended Data Fig. 8b, bar plots). As the proteomics analysis was
unable to distinguish mHTT from wtHTT, the actual reduction in mHTT
is likely to be higher than this. Meanwhile, using the criterion of P < 0.01,
we observed changes in only a small percentage of proteins (Extended
Data Fig. 8b; see Supplementary Table 2 for details). No
autophagy-specific substrate proteins exhibited significant changes and there
was no enrichment of proteins associated with the autophagy pathway
(Supplementary Table 2), further confirming that autophagy was
unaffected. Proteomics analysis in cultured neurons gave consistent
results (Extended Data Fig. 8c; see Supplementary Table 3 for details).


Linker compounds depleted expanded polyQ proteins 


The linker compounds interacted with and lowered mHTT but not
wtHTT (Fig. 1). The simplest explanation for this specificity is that the
compounds specifically interact with the expanded polyQ tract,
possibly by recognizing its emergent conformation, which is different
from that of the short polyQ stretch. If so, the linker compounds
may also affect other proteins with expanded polyQ regions.
Consistent with this prediction, compounds 1005, AN1 and AN2 reduced the
levels of mutant but not wild-type ATXN3 in fibroblasts from patients
with spinocerebellar ataxia type 3 (SCA3) (Extended Data Fig. 9a) and
Fig. 5 | Linker compounds rescue HD-relevant phenotypes in cells and
in vivo. a, Left, representative DAPI staining and immunostaining of the
neuronal-specific tubulin marker TUBB3, showing neuronal morphology of
patient iPS-cell-derived striatal neurons (HD: Q47; WT: Q19) treated with
the indicated compounds. Right, quantification of TUBB3 area per cell. Scale bar,
50 μm. b, Neuronal apoptosis at different time points after BDNF removal,
measured using a green fluorescent dye (NucView 488) to detect active
caspase-3. c, Left, Kaplan-Meier survival curves of transgenic Drosophila with


exogenously expressed 72Q-GFP, 46Q-GFP and 38Q-GFP but not
25Q-GFP proteins (containing only Met-polyQ-sfGFP sequences) in
HEK293T cells (Extended Data Fig. 9b). These data suggest that the
compounds distinguished the expanded polyQ stretch from the short
polyQ stretch at a threshold between 25Q and 38Q. To further confirm
this, we tested the interactions of these compounds with polyQ motifs
(Extended Data Fig. 9c) and confirmed that 1005, AN1 and AN2 interact
with polyQ-GFP with 38 or more glutamine residues, but not 25Q-GFP
or GFP alone (Extended Data Figs. 2, 9d, e).


Linker compounds rescued HD-relevant phenotypes 

We further investigated the therapeutic potential of the compounds 
for treating HD. All the mHTT-LC3 linker compounds rescued mHTT 
toxicity in neurons derived from iPS cells of patients with HD (Fig. 5a, b). 
They also rescued HD-relevant behavioural deficits and increased the 
lifespan of flies expressing human mHTT, while having no influence on 
the flies expressing wtHTT (Fig. 5c). 

Finally, we investigated the disease-relevant behavioural phenotypes
in ten-month-old heterozygous HD-knock-in mice (HdhQ7/Q140). HD
mice exhibited significant deficits in several behavioural tests,
including rotarod, balance beam and gripping force tests (Extended Data
Fig. 9f-h). Intraperitoneal injection of 1005 or AN2, but not of DMSO
the indicated transgenes and treatments. Right, climbing performance of the
treated transgenic flies as a function of age after eclosion. d-f, Mouse
behavioural tests showing improvement of HD-relevant phenotypes after
intraperitoneal injection of the indicated compounds at 0.5 mg kg−1. g, A model
showing how mHTT-LC3 linker compounds may induce mHTT degradation,
illustrating the concept of lowering target protein levels using
autophagosome-tethering compounds. The images representing HTT are
reproduced from ref.’ under a CC BY license.


only, significantly improved HD-relevant behavioural deficits in these
tests, without influencing the wild-type mice (Fig. 5d-f), demonstrating
a rescue of HD-relevant phenotypes. This is a proof-of-principle study,
and further investigations will be required to establish the suitability
of these compounds for therapeutic application.


Discussion 


We have identified mHTT-LC3 linker compounds that are able to
reduce mHTT levels at nanomolar concentrations in HD cells and at
0.5 mg kg−1 by intraperitoneal injection in vivo (Extended Data Table 1).
The compounds did not influence wtHTT, which has essential functions,
especially during development and young adulthood. These features
of the compounds are highly desirable for the treatment of HD and
potentially for the treatment of other polyQ diseases (Extended Data
Fig. 9a-e), although preclinical studies of longitudinal efficacy and
safety will be necessary for therapeutic development.

From a broader perspective, we have demonstrated the concept
of using small-molecule compounds to target proteins (for example,
mHTT) for autophagic degradation by linking them to LC3 (Fig. 5g). We
selected mHTT as the target protein because wtHTT provides a good
internal control for screening. We identified compounds that interact
with both LC3B and mHTT; however, if no such compounds had been
identified, linker compounds could still be generated by conjugating
an mHTT-interacting compound and an LC3-interacting compound
using the nucleophile-isocyanate reaction used to create the SMMs.
The critical next step in developing this concept will be to resolve the
core chemical moiety that interacts with LC3 without influencing its
function. Comprehensive medicinal chemistry and structural studies
are needed to resolve the compound-LC3 interaction interface, which
could then be developed to create a general degradation-targeting
tool for conjugation with other compounds that interact with specific
proteins of interest.

In summary, we have identified mHTT-LC3 linker compounds that
are capable of lowering mHTT levels in vivo in an allele-selective manner
and demonstrated the possibility of targeting proteins for degradation
using autophagosome-tethering compounds, providing new entry
points for drug discovery.
Methods 


Additional details from figure legends 

In Fig. 5a, loss of processes and shrinkage of neurons were observed 
in HD neurons after BDNF removal. Bar plots show quantification of 
the area showing TUBB3 signal (TUBB3 area) normalized to the nuclei 
counts based on DAPI staining. A lower TUBB3 area per cell reflects 
shrinkage and loss of neuronal processes. Data were normalized to the 
average of wild-type controls. The data were analysed by one-way 
ANOVA (F(5, 60) = 94.78) with post hoc Dunnett’s tests. ****P < 0.0001; 
n indicates the number of independently plated wells. 
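
The normalization described for Fig. 5a can be sketched as follows. This is a minimal illustration only, not the analysis code used in the study: the function names and the example numbers are hypothetical, and the sketch assumes that each well contributes one TUBB3 area and one DAPI-based nuclei count.

```python
from statistics import mean


def tubb3_area_per_cell(tubb3_area, nuclei_count):
    """TUBB3-positive area divided by the DAPI-based nuclei count of a well."""
    if nuclei_count <= 0:
        raise ValueError("nuclei_count must be positive")
    return tubb3_area / nuclei_count


def normalize_to_wild_type(per_cell_values, wild_type_values):
    """Express per-cell values relative to the mean of the wild-type wells."""
    wt_mean = mean(wild_type_values)
    return [value / wt_mean for value in per_cell_values]


# Hypothetical example wells (arbitrary area units).
wild_type = [tubb3_area_per_cell(200.0, 100), tubb3_area_per_cell(300.0, 100)]
hd = [tubb3_area_per_cell(125.0, 100)]
relative_hd = normalize_to_wild_type(hd, wild_type)
```

With this convention, wild-type wells average 1 by construction, and a value below 1 in HD wells corresponds to a reduced TUBB3 area per cell.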

In Fig. 5b, the images were captured every 3 h inside the incubator 
using Incucyte, and the caspase-3 active cells were quantified by the 
fluorescent-object count per field. The data were analysed by two-way 
ANOVA (F(43, 516) = 12.85) with post hoc Dunnett’s tests, comparing to 
the HD_DMSO group. ****P< 0.0001. The numbers in brackets indicate 
the number of independently plated wells, with four fields per well 
imaged and averaged for quantification. Three batches were tested 
and showed consistent results. 

In Fig. 5c, left, Drosophila expressed full-length HTT proteins (Q128 or 
Q16) in the nervous system driven by elav-GAL4. Seventy-five flies were 
tested for each group. The data were analysed by log-rank (Mantel–Cox) 
test, comparing compound-treated groups with DMSO controls in Q128 
flies. ****P < 0.0001. Fig. 5c, right, is similar to Fig. 5c, left, but plots 
the climbing performance as a function of age after eclosion. Data were 
analysed by two-way ANOVA (F(4, 275) = 122.1) with post hoc Dunnett’s 
tests, comparing the compound-treated groups with the DMSO controls 
in Q128 flies. Numbers in brackets indicate the number of vials (each 
containing 15 flies) tested. ****P < 0.0001. 

In Fig. 5d–f, the numbers in brackets indicate the number of mice 
tested. The data were analysed by two-way ANOVA with post hoc Dunnett’s 
tests, and P values were calculated for the comparison with the 
DMSO control. ****P < 0.0001. For HD mice, F(2, 195) = 4.963 in rotarod 
tests, F(2, 195) = 37.31 in balance beam tests, and F(2, 156) = 7.068 in 
gripping force tests. No significant difference was detected among 
wild-type mice injected with the different compounds. Investigators 
were blinded to the compounds and genotypes when performing the 
experiments. In all panels, graphical data are presented as mean and 
s.e.m. 


Compound stamping on the microarray 

SMMs containing 3,375 bioactive compounds were used for high- 
throughput screening of target proteins. The compound library, 
comprising 1,527 drugs approved by the US Food and Drug Administration 
(FDA), 1,053 natural products from traditional Chinese 
medicine and 795 known inhibitors, was stamped onto the SMMs. 
Each compound was dissolved in DMSO at a concentration of 10 mM 
and printed in duplicate along the vertical direction on homemade 
phenyl-isocyanate-functionalized glass slides with a contact microarray 
printer (SmartArrayer 136, CapitalBio Corporation). Biotin-BSA at a 
concentration of 7,600 nM in 1x phosphate-buffered saline (PBS) and 
biotin-(PEG)₂-NH₂ at a concentration of 5 mM in DMSO were printed 
as the inner and outer borders of SMMs, respectively. The diameter of 
each spot was about 150 μm and the spacing between two adjacent spots 
was 250 μm. The printed SMMs were then dried at 45 °C for 24 h to 
facilitate covalent bonding of nucleophilic groups of small molecules 
to isocyanate groups of the functionalized slides. Afterwards, the SMMs 
were stored in a −20 °C freezer. 


Expression and purification of recombinant proteins 

The human microtubule-associated protein 1 light chain 3-β (MAP1LC3B 
(LC3B)) gene (GenBank: NM_022818.4) was amplified by PCR and cloned 
into a pGEX-6P1 (GE Healthcare)-derived vector, pGHT, which is a 
prokaryotic expression vector reconstructed by adding a His tag and a TEV 
protease cleavage site before the pGEX-6P1 multiple cloning site. After 
sequencing verification, the expression plasmid pGHT-LC3B was introduced 
into Escherichia coli BL21 (DE3) pLysS, in which the recombinant 
GST-LC3B protein was expressed by induction with IPTG. When the 
bacterial culture reached OD₆₀₀ = 0.8, its temperature was decreased 
to 18 °C, and 0.2 mM IPTG was added to the culture for an additional 
20 h of incubation. The cells were then harvested by centrifugation (6,000g, 
4 °C, 15 min) and the cell pellet was suspended in 50 mM Tris-HCl buffer, 
pH 7.5, with 150 mM NaCl and 5% glycerol. Cells were then disrupted 
by sonication, followed by centrifugation (20,000g, 4 °C, 60 min). 
The supernatants were then loaded onto a HisTrap HP column 
(GE Healthcare, cat. no. 17524701), and eluted with 50 mM Tris-HCl 
buffer, pH 7.5, containing 150 mM NaCl, 5% glycerol and 300 mM 
imidazole. The LC3B eluate was then mixed with TEV protease (Sigma, cat. 
no. T4455; eluted protein:TEV protease = 100:1) and dialysed against 
the dialysis buffer (50 mM Tris-HCl buffer, pH 7.5, containing 100 mM 
NaCl) at 4 °C overnight. After TEV protease treatment, the samples were 
loaded onto a HisTrap HP column again, and the flowthrough fraction, 
which mainly contained tag-removed recombinant LC3B, was collected. 
Afterwards, the proteins were concentrated and further purified by Superose 6 
Increase 10/300 GL (GE Healthcare) size-exclusion chromatography. 
Finally, the purified proteins were concentrated to approximately 10 
mg ml⁻¹ in 50 mM HEPES buffer with 100 mM NaCl for further analysis. 
The MBP-His and Rpn10 proteins were purified similarly. 

The full-length HTT proteins, HTTexon1-MBP, polyQ-sfGFP and 
sfGFP were purified from mammalian cells. For full-length HTT proteins, 
the human HTT gene (GenBank: NM_002111.8) with (CAG)₂₃ 
or (CAG)₇₃ (23Q or 73Q for proteins) was de novo synthesized (by 
Genewiz), sequence validated and then cloned into a modified pCAG 
vector with an N-terminal protein A tag. The plasmid was transfected into 
human embryonic kidney E293 cells using polyethylenimine (PEI, from 
Polysciences, cat. no. 23966). After culture at 37 °C for 48 to 60 h, cells 
were collected and lysed at 4 °C for 1 h in lysis buffer containing 50 mM 
Tris-HCl, pH 8.0, 150 mM NaCl, 5% glycerol, 0.5% CHAPS, 3 mM DTT, 1% 
PMSF, 1 μg ml⁻¹ pepstatin, 1 μg ml⁻¹ leupeptin, 1 μg ml⁻¹ aprotinin, 
5 mM ATP and 5 mM MgCl₂. After centrifugation at 15,000 r.p.m. for 
40 min, the supernatants were incubated with IgG monoclonal 
antibody-agarose (Smart-lifesciences, cat. no. SA030010) for 2 h and 
unbound proteins were extensively washed away. The HTT proteins 
were then digested using TEV protease overnight to remove the protein 
A tag, and the eluted protein was further purified by ion exchange and gel 
filtration chromatography using Mono Q and Superose 6 (5/150 GL) 
columns from GE Healthcare. The peak fractions were pooled for further 
biochemical analysis. The HTTexon1 cDNAs with 25Q or 72Q were also 
de novo synthesized and cloned into a mammalian expression vector, 
pTT5SH8Q2, for large-scale production in HEK293T cells. To 
improve the production yield and increase the solubility, a C-terminal 
MBP tag was added after the HTTexon1 sequences to generate the 
pTT-HTTexon125Q-MBP and pTT-HTTexon172Q-MBP plasmids. For protein 
production and purification, the HEK293T cells were transfected with the 
pTT-HTT25QExon1-MBP and pTT-HTT72QExon1-MBP plasmids using linear 
PEI (Polysciences, cat. no. 24765), and then collected after 48 h. The 
cells were then lysed by sonication in buffer containing 50 mM Tris-HCl, 
pH 7.5, 150 mM NaCl, 20 mM imidazole, 5% glycerol, protease inhibitor 
cocktail (Sigma) and 50 U ml⁻¹ benzonase (Sigma). After centrifugation, 
the supernatants were loaded onto a HisTrap HP column (GE Healthcare), 
and eluted with buffer containing 50 mM Tris-HCl, pH 7.5, 150 mM 
NaCl, 300 mM imidazole, 5% glycerol and protease inhibitor cocktail. 
The MBP tag was not cleaved, to avoid precipitation. Afterwards, the 
proteins were concentrated and further purified by Superose 6 Increase 
10/300 GL (GE Healthcare) size-exclusion chromatography. 


Verifications of the recombinant proteins by MALDI-TOF 

The purified LC3B, HTTexon1Q25-MBP and HTTexon1Q72-MBP proteins 
were dialysed into 5 mM NH₄Ac by Superose 6 Increase size-exclusion 
chromatography for linear-mode matrix-assisted laser desorption 
ionization-time of flight mass spectrometry (MALDI-TOF) analysis on 
a Bruker FLEX MALDI-TOF instrument. A total of 1,500–2,500 scans 
were averaged for each spectrum using an accelerating voltage of 25 kV. 
Sinapinic acid (SA, Bruker, cat. no. 820135) was used as the matrix for 
protein and peptide analyses. Sinapinic acid was made into a 20 mg ml⁻¹ 
solution in 70% acetonitrile, 0.1% trifluoroacetic acid. For the acquisition 
of spectra from 10,000 to 100,000 amu, 2 μl of sample was mixed 
with 2 μl of sinapinic acid solution in an Eppendorf tube, and 2 μl of the 
mixture was loaded onto the MALDI plate. The calibration standard for 
this range was BSA (M + 66,431) (Sigma, cat. no. A1933). All spectra 
were obtained in positive linear mode. The amount of full-length HTT 
proteins was limited, and they were thus not validated by MALDI-TOF. Instead, 
they were further purified by ion exchange and gel filtration chromatography, 
and validated by Coomassie blue staining (Extended Data 
Fig. 1e) and western blot (Fig. 4b). 


Verifications of the recombinant LC3B by X-ray diffraction 
crystallography 

Because the deletion of G120 (lipidation site) stabilizes LC3B protein, we 
used LC3B(ΔG120) protein to obtain high-resolution diffraction data. 
Purified LC3B(ΔG120) protein was concentrated in 20 mM HEPES pH 7.5, 
150 mM NaCl. The LC3B(ΔG120) crystal was grown in reservoir solutions 
consisting of 0.16 M ammonium sulfate, 0.08 M sodium acetate pH 4.6, 
20% (w/v) PEG4000, 20% (v/v) glycerol and 0.01 M taurine. 


Refinement 

The X-ray diffraction data were collected at 100 K at beamlines BL17U1 
and BL19U1, SSRF. The wavelength for data collection was 0.97892 
Å. Diffraction images were indexed and processed by HKL2000. The 
structure of LC3B(ΔG120) (PDB ID: 6J04, 1.90 Å) was solved by molecular 
replacement with the Phaser 2.8 program from the CCP4 crystallog- 
raphy package using PDB structure 1UGM as the search model. The 
refinement was performed by Refmac 5.5 and Phenix 1.14. There are 
no Ramachandran outliers to report. The related figure was drawn 
using PyMOL 2.2. 


Compound-protein interaction measurements by OI-RD 
For high-throughput preliminary screening of target proteins, a SMM 
was assembled into a fluidic cartridge and washed in situ with a flow 
of 1x PBS to remove excess unbound small molecules. After washing, 
the SMM was scanned with a label-free OI-RD scanning microscope to 
image small molecules immobilized on glass slides. After it was blocked 
with 7,600 nM BSA in 1x PBS for 30 min, the SMM was incubated with the 
target protein for 2 h. HTTexon1(Q25)-MBP at a concentration of 
454 nM, HTTexon1(Q72)-MBP at a concentration of 238 nM, and LC3B 
at a concentration of 680 nM were screened on separate fresh SMMs. 
OI-RD images were scanned for each operation, including washing, 
blocking and incubation. The OI-RD difference images (images after 
incubation − images before incubation) were used for analysis, and 
vertical bright doublet spots indicated compounds that bind to target 
proteins in both replicates. Compounds 8F20 and 1005 were identified 
to bind to HTTexon1(Q72)-MBP and LC3B, but not to HTTexon1(Q25)-MBP. 
The binding was further confirmed by kinetics measurements. 
To measure binding kinetics of target proteins with compounds, we 
prepared new SMMs consisting of 8F20, 1005 and AN2. Six identical 
microarrays were printed on one glass slide and each compound was 
printed in triplicate in a single microarray. The printed small SMMs 
were assembled into a fluidic cartridge with each microarray housed in 
a separate chamber. Before the binding reaction, the slide was washed 
in situ with a flow of 1x PBS to remove excess unbound samples, followed 
by blocking with 7,600 nM BSA in 1x PBS for 30 min. For binding kinetics 
measurement, 1x PBS was first flowed through a reaction chamber at a 
flow rate of 0.01 ml min⁻¹ for 5 min to acquire the baseline. 1x PBS was 
then quickly replaced with the probe solution of the target protein at 
a flow rate of 2 ml min⁻¹ for 9 s followed by a reduced flow rate of 0.01 
ml min⁻¹ to have the microarray incubated in the probe solution under 
the flow condition for 35 min (association phase of the reaction). The 
probe solution was then quickly replaced with 1x PBS at a flow rate 
of 2 ml min⁻¹ for 9 s followed by a reduced flow rate of 0.01 ml min⁻¹ 
to allow dissociation of probe for 30 min (dissociation phase of the 
reaction). By repeating the binding reactions of the target protein at 
three different concentrations on separate fresh microarrays, binding 
curves of compounds with the target protein at three concentrations 
were recorded with a scanning OI-RD microscope. Reaction kinetic rate 
constants were extracted by fitting the binding curves globally using 
a 1-to-1 Langmuir reaction model. 
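
The 1-to-1 Langmuir model named above can be written down as a short forward model. The sketch below is illustrative only and is not the fitting code used in the study: the function names, the signal scale and the example rate constants are hypothetical, and the global fit itself (one pair of rate constants shared across the three probe concentrations) is not implemented here. It only shows the curve shape that such a fit would compare with the recorded association and dissociation phases.

```python
import math


def langmuir_signal(t, conc, k_on, k_off, r_max, t_assoc):
    """Signal at time t (s) for a 1:1 Langmuir interaction.

    conc is the probe concentration (M), k_on is in M^-1 s^-1, k_off in s^-1,
    r_max is the signal at full surface occupancy and t_assoc is the length
    of the association phase (s).
    """
    k_obs = k_on * conc + k_off              # observed association rate
    r_eq = r_max * k_on * conc / k_obs       # plateau at this concentration
    if t <= t_assoc:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_assoc))
    return r_end * math.exp(-k_off * (t - t_assoc))


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_d = k_off / k_on (M)."""
    return k_off / k_on


# Hypothetical parameters; the 35-min association phase follows the text.
T_ASSOC = 35 * 60
curve = [langmuir_signal(t, 1e-7, 1e4, 1e-3, 1.0, T_ASSOC)
         for t in range(0, T_ASSOC + 30 * 60 + 1, 60)]
```

In a global fit, `k_on` and `k_off` would be shared by all three concentrations, which constrains the rate constants more tightly than fitting each curve separately.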


Compound-protein interaction measurements by MST 

The purified recombinant proteins were dialysed into 1x PBS, and then 
labelled according to the protocol of the Protein labelling kit RED-NHS 
(NanoTemper, cat. no. L001). All the tested stock compounds (10 mM) 
dissolved in DMSO were also diluted into the same buffer for the final 
MST assay. The MST experiment was performed using a Monolith NT.115 
instrument (NanoTemper Technologies). Labelled proteins (500 nM) 
were mixed with the indicated concentrations of candidate compounds 
in reaction buffer containing 20 mM HEPES, pH 7.4, 150 mM NaCl. The 
MST data were then collected under 40% infrared laser power and 20% 
light-emitting diode power. The data were analysed by NanoTemper 
analysis software (v.1.5.41) and the Kd was determined. 
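
As an illustration of how a Kd relates to an MST dose-response curve, the sketch below evaluates the standard 1:1 mass-action expression for the fraction of labelled protein bound. This is an assumption made for illustration: the text does not state which model the analysis software applied, and the function name and example values are hypothetical. Only the 500 nM labelled-protein concentration is taken from the text.

```python
import math


def fraction_bound(ligand_nM, kd_nM, protein_nM=500.0):
    """Fraction of labelled protein bound at a given total compound concentration.

    Uses the quadratic solution of 1:1 binding, which accounts for ligand
    depletion (relevant when the protein concentration is not far below Kd).
    """
    total = protein_nM + ligand_nM + kd_nM
    complex_nM = (total - math.sqrt(total ** 2 - 4.0 * protein_nM * ligand_nM)) / 2.0
    return complex_nM / protein_nM


# Hypothetical titration: 1,000 nM compound with Kd = 750 nM gives half occupancy.
half_occupancy = fraction_bound(1000.0, 750.0)
```

A fit would vary `kd_nM` (together with the signal amplitudes of the unbound and bound states) until the predicted fractions match the normalized MST response across the titration series.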


cDNA plasmids for transfection in mammalian cells 

The pEX-GFP-hLC3WT plasmid was obtained from Addgene (24987) to 
express LC3B. The pTT-HTTexon1-Q72-MBP-His and pTT-HTTexon1- 
Q25-MBP-His plasmids were generated by subcloning HTTexon1 cDNAs into 
the mammalian expression vector pTT-MBP-His and then transiently 
transfected into HeLa cells to express HTTexon1 proteins for the 
colocalization experiments. The polyQ-GFP sequences (expressing 
Met-polyQ-sfGFP) were de novo synthesized and subcloned into the 
pcDNA vector. All plasmids were sequence validated. For transient 
transfections, the cells were plated at 50% confluence. After 24 h, the 
cDNAs were transfected with Lipofectamine 2000 (Thermo Fisher 
Scientific, cat. no. 11668019) using the forward transfection protocol 
provided by the manufacturer. 


Cell culture 
For mouse primary cortical neuron cultures, cortices were isolated from 
postnatal day 0 pups following genotyping. Cortices were dissected 
into cold Ca²⁺- and Mg²⁺-free PBS buffer. Chopped small pieces were 
digested in solution containing 2.5% trypsin (Sigma, cat. no. P1005) 
and DNase I (0.1 mg ml⁻¹, Sigma, cat. no. D5025), for 20–30 min at 37 °C. 
Tissues were transferred to 10% FBS containing DMEM (Thermo Fisher 
Scientific, cat. no. 11965) to cease digestion. Neurons were then dissoci- 
ated by trituration with fire-polished glass pipettes, collected by spin- 
ning and plated onto polylysine-coated dishes at 4 x 10° cells per 35-mm 
dish. The growth medium was composed of Neurobasal A medium 
(Thermo Fisher Scientific, cat. no. 10888022) with 1x B-27 (Thermo 
Fisher Scientific, cat. no. 17504044) and 1x N2 supplement (Thermo 
Fisher Scientific, cat. no. 17504048). Cytosine-arabinofuranoside 
(Sigma, cat. no. C1768) was added at 6 μM to inhibit glial growth. 
Some of the primary patient fibroblasts were obtained from HD 
patients (Q47, Q49, Q55) and healthy sibling (WT, Q19) controls in a 
family with HD from Mongolia. The HD Q68 fibroblast line was obtained 
from Coriell Cell Repositories. The PD line was obtained from a patient 
with idiopathic Parkinson’s disease, and the SCA3 line was obtained 
from a patient with SCA3 harbouring the ATXN3 expansion mutation 
(Q74). The studies were approved by the Ethics Committee of the Institutes 
of Biomedical Sciences at Fudan University (#28) for obtaining the HD 
and wild-type patient fibroblasts, and by the Huashan Hospital Institutional 
Review Board at Fudan University (#174) for obtaining the fibroblasts 
from patients with PD and SCA3. Verbal and written consent was 
obtained from patients. The procedures were in compliance with all relevant 
ethical regulations. The immortalized fibroblasts were generated 
by infection with lentivirus expressing SV40T. For generation of iPS cells, 
the primary fibroblasts were transduced with the retroviral STEMCCA 
polycistronic reprogramming system (Millipore, cat. no. SCR548). The 
iPS cells were confirmed positive for Tra-1-81, Tra-1-60, SSEA-4 and 
Nanog by immunofluorescence and flow cytometry. All four vector- 
encoded transgenes were found to be silenced and the karyotype was 
normal. iPS cells were cultured in E8 medium (Thermo Fisher Scientific, 
cat. no. A1517001) on Matrigel (Corning, cat. no. 354277) surface. iPS 
cells were differentiated to Pax6-expressing primitive neuroepithelia 
(NE) for 10-12 days in a neural induction medium. Sonic hedgehog 
(SHH, 200 ng ml⁻¹) was added on days 10–25 to induce ventral progenitors. 
For neuronal differentiation, neural progenitor clusters were 
dissociated and placed onto poly-ornithine/laminin-coated coverslips 
at day 26 in Neurobasal medium (Thermo Fisher Scientific, cat. no. 
21103049), with 1x B-27 (Thermo Fisher Scientific, cat. no. 17504044), 
1x N-2 (Thermo Fisher Scientific, cat. no. 17504048), brain derived 
neurotrophic factor (BDNF, 20 ng/ml, PeproTech, cat. no. 450-02), glial-derived 
neurotrophic factor (GDNF, 10 ng/ml, PeproTech, cat. no. 450-10), 
insulin-like growth factor 1 (IGF1, 10 ng/ml, PeproTech, cat. no. 100-11) and 
vitamin C (Sigma, cat. no. D-0260, 200 ng/ml). The mouse striatal cells 
(STHdh) were obtained from Coriell Cell Repositories. The HEK293T 
cells and the HeLa cells were originally obtained from American Type 
Culture Collection (ATCC). STHdh, HeLa and HEK293T cells were cul- 
tured in DMEM (Thermo Fisher Scientific, cat. no. 11965) with 10% (vol/ 
vol) FBS (Thermo Fisher Scientific, cat. no. 10082-147). Atg5 WT and 
KO MEFs were from N. Mizushima. All the mammalian cell lines were 
maintained in a 37 °C incubator with 5% CO₂, except STHdh cells, which 
were maintained at 33 °C with 5% CO₂. The cells were tested every two 
months with a TransDetect PCR Mycoplasma Detection Kit (Transgen 
Biotech, cat. no. FM311-01) to ensure that they were mycoplasma free. 
The CellTiter-Glo assay was performed to measure cell viability with 
the indicated compound treatment (Extended Data Fig. 2e) following 
the protocol provided in the kit (Promega, cat. no. G7570). 


HD Drosophila models 

The nervous system driver line elav-GAL4 (c155) and the HTT-expressing 
lines UAS-flHTT-Q16 and UAS-flHTT-Q128 (expressing human 
full-length HTT with 16Q and 128Q, respectively, when crossed to the 
GAL4 line) were obtained from the Bloomington Drosophila Stock 
Center at Indiana University (http://flystocks.bio.indiana.edu/), and 
maintained in a 25 °C incubator. Crosses were set up between virgin 
female flies carrying elav-GAL4 driver and the UAS-flHTT-Q16 or UAS- 
flHTT-Q128 male flies to generate the desired genotypes. 


HD mouse models 

The generation and characterization of the Hdh140Q knock-in mice 
have been previously described”. Mice were group-housed (up to 5 
adult mice per cage) in individually vented cages with a 12 h light/dark 
cycle. The mouse experiments were carried out following the ARRIVE 
(Animal Research: Reporting of In vivo Experiments) guidelines, and 
they were in compliance with all relevant ethical regulations. The Ani- 
mal Care and Use Committee of the School of Medicine at Fudan Uni- 
versity approved the protocol used in animal experiments (Approval 
20140904 and 20170223-005). 


Compound treatment in cells and animals 

The compounds used in this study were all commercially available, 
and quality controlled by the vendors using NMR. 1005: GW5074 (DC 
Chemicals, cat. no. DC8810); 8F20: ispinesib (Selleck, cat. no. S1452); 
AN1: 5-bromo-3-[(4-hydroxyphenyl)methylidene]-2,3-dihydro-1H-indol-2-one 
(Specs, cat. no. AN-655/15003575); AN2: 5,7-dihydroxy-4-phenylcoumarin 
(ChemDiv, cat. no. D715-2435); GSK923295 (Selleck, cat. no. S7090); 
BAY1217389 (Selleck, cat. no. S8215); PLX-4720 (Selleck, cat. no. S1152); 
dabrafenib (Selleck, cat. no. S2807); sorafenib tosylate 
(Selleck, cat. no. S1040); rapamycin (Sigma-Aldrich, cat. no. R8781). 

For compound treatment in the cells, the compounds were diluted 
in culture medium to 10x concentrations and added to the plated cells: 
for primary cultured neurons and iPS-cell-derived neurons, the com- 
pounds were added 5 days after plating; for patient fibroblasts and 
other cell lines, the compounds were added 1 day after plating. The 
cells were then collected 2 days later for measurement of HTT levels. 
For detection of HTT-LC3 colocalization, the cells were fixed 4 h after 
compound treatment. For caspase-3 activation detection, the cells 
were stressed (BDNF removal for iPS-cell-derived neurons) 1 day after 
compound treatment, and tested at the indicated time points. 

For compound treatment in the Drosophila, flies were maintained 
in standard maize food at 25 °C. For drug feeding, maize media was 
heated to 45 °C until liquid and distributed into vials. Compounds were 
freshly prepared in DMSO and added to the media. New adult flies were 
transferred to vials with 400 μL of the control (DMSO) or compound-containing 
food, which was changed every other day. 

For compound treatment in mice using intracerebroventricular (icv) 
injection, the 3-month-old mice were anaesthetized using a small animal 
anaesthesia machine (MSS-3, MSS International) with isoflurane (1.5% 
solution). We surgically implanted each mouse with a guide cannula 
directed towards the lateral ventricle. The coordinates for implantation 
were determined using “The Mouse Brain in Stereotaxic Coordinates” 
and the guide cannulas were placed at 0.6 mm posterior, 1.5 mm lateral 
(left), and 1.7 mm dorsal with respect to bregma. A cap with stylus was 
then inserted into the guide cannula to seal its opening. Mice were 
then allowed to recover from surgery for a week before being treated. 
For injection, we first inserted an internal injector cannula so that it 
extended 0.5 mm beyond the tip of the guide cannula to reach the lateral 
ventricle. We then injected the mice through the internal injector cannula 
using a 25 μL syringe (Hamilton 1700 Series Microlitre Syringes, 
Bonaduz, GR, CH) at a flow rate of 0.25 μL/min powered by a syringe 
pump (KDS Legato 130) to administer 2 μL of compound-containing 
artificial cerebrospinal fluid (ACSF: 1 mM glucose, 119 mM NaCl, 2.5 mM 
KCl, 1.3 mM MgSO₄, 2.5 mM CaCl₂, 26.2 mM NaHCO₃, 1 mM NaH₂PO₄) at 
a concentration of 25 μM (containing 0.125% vol/vol DMSO). 2 μL ACSF 
containing an equivalent amount of DMSO (0.125% vol/vol) was used as 
the control. The injector cannula was left in place for approximately 
60 s to allow for diffusion before placing the caps with stylus back in 
guide cannulas. 
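
The icv injection parameters above imply an infusion time and a delivered amount of compound, which can be checked with a few lines of arithmetic. The helper names below are hypothetical and the sketch simply restates the numbers given in the text (2 μL at 0.25 μL/min, 25 μM compound).

```python
def infusion_minutes(volume_ul, rate_ul_per_min):
    """Time needed to deliver a volume at a constant pump rate."""
    return volume_ul / rate_ul_per_min


def delivered_pmol(volume_ul, conc_um):
    """Amount delivered: microlitres x micromolar = picomoles."""
    return volume_ul * conc_um


icv_time_min = infusion_minutes(2.0, 0.25)   # 8 min per injection
icv_amount_pmol = delivered_pmol(2.0, 25.0)  # 50 pmol of compound per injection
```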

For compound treatment in mice using intraperitoneal (ip) injection, 
each mouse was weighed. The compounds were diluted with 0.9% 
NaCl intravenous infusion solution to 0.05 μg/μL (containing 0.011 μg/μL 
DMSO) and injected into each mouse based on the weight of the 
mouse (500 μg/kg, containing 110 μg/kg DMSO). As controls, an equivalent 
amount of DMSO was diluted and injected in the same way. Injection 
of 0.9% NaCl intravenous infusion solution alone was also tested 
and showed no difference (Extended Data Fig. 9f-h). One injection 
per day was performed for two weeks before subsequent behavioural 
experiments or tissue extractions. 
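
The weight-based ip dosing can be made concrete with a short calculation. This is a sketch with hypothetical function names and a hypothetical 25 g example mouse; the dose (500 μg/kg), the working concentration (0.05 μg/μL) and the DMSO content (0.011 μg/μL) are the values stated above.

```python
def injection_volume_ul(body_weight_g, dose_ug_per_kg=500.0, stock_ug_per_ul=0.05):
    """Volume of working solution (uL) that delivers the dose for one mouse."""
    dose_ug = dose_ug_per_kg * body_weight_g / 1000.0
    return dose_ug / stock_ug_per_ul


def dmso_ug_per_kg(body_weight_g, volume_ul, dmso_ug_per_ul=0.011):
    """DMSO delivered with that volume, expressed per kg of body weight."""
    return dmso_ug_per_ul * volume_ul / (body_weight_g / 1000.0)


volume = injection_volume_ul(25.0)   # about 250 uL for a 25 g mouse
dmso = dmso_ug_per_kg(25.0, volume)  # about 110 ug/kg, matching the text
```

At these concentrations the injected volume works out to 10 μL per gram of body weight, and the DMSO dose of 110 μg/kg follows directly from the 0.011 μg/μL content of the working solution.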

Note that in some of the experiments (Figs. 4, 5 and Extended Data 
Fig. 6b–d), 8F20 and/or AN1 were not tested. 8F20 was not tested 
because it did not have an effect in vivo by icv injection (Extended 
Data Fig. 6a). AN1 was not tested because its structure is highly similar 
to that of 1005, whereas it had a weaker HTT-lowering effect by icv injection 
(Extended Data Fig. 6a). 


Protein extraction from cells and tissues 

For protein extraction from cells, the cell pellets were collected and 
lysed on ice for 30 min in 1x PBS + 1% Triton X-100 + 1x complete protease 
inhibitor (Sigma-Aldrich, cat. no. 11697498001), sonicated for 10 s, and 
spun at >20,000g at 4 °C for 15 min. The supernatants were then loaded onto gels 
and transferred onto nitrocellulose membranes for western blotting. 
For mouse brain tissues, the mouse striata and cortices were dissected 
on ice, ground with a tissue grinder for 5 min at 60 Hz and lysed on ice 
for 60 min in brain lysis buffer (50 mM Tris, 250 mM NaCl, 5 mM EDTA, 
1% Triton X-100, pH 7.4) + 1x complete protease inhibitor (Roche, cat. 
no. 4693159001). The samples were then sonicated for 10 cycles, 15 s 
on and 20 s off, and then collected for western blot. 

For protein extraction from the mouse brain, the brains were collected 
and the cortices were acutely dissected on ice and homogenized 
with a tissue grinder for 5 min at 60 Hz and lysed on ice for 60 min in 
brain lysis buffer (50 mM Tris, 250 mM NaCl, 5 mM EDTA, 1% (vol/vol) Triton 
X-100, 1x complete protease inhibitor (Roche, cat. no. 4693159001), 
pH 7.4). The samples were then sonicated for 10 cycles, 15 s on and 20 
s off, and then collected for western blots, HTRF or dot blots. 

For mHTT measurements in the HD Drosophila model, the fly heads 
were collected at the age of 7 days and lysed on ice for 30 min in PBS 
+ 1% (vol/vol) Triton X-100 + 1x complete protease inhibitor (Roche, 
cat. no. 4693159001), sonicated for 10 cycles, 15 s on and 20 s off, and 
then collected for HTRF. 

For all the samples, the protein concentrations were measured to 
correct the loadings. Different protein concentrations or cell num- 
bers per well were tested to ensure that the signals were in the linear 
range. Background corrections were performed by subtracting the 
background signals from blank samples. 


Western blot and filter trap assays 

For western blots, the samples were loaded onto an SDS-PAGE gel (5–12% 
depending on the molecular weight of the protein of interest). The proteins 
on the gel were then transferred to nitrocellulose membranes 
for blocking and antibody detection. The signal was detected with ECL 
(Bio-Rad, cat. no. 1705061) after 1 h incubation of the membrane with 
secondary antibody (1:10,000). 

The filter trap assay was performed as previously described. 
Aliquots (2 μL, 10 μg) of each sample were loaded onto 
nitrocellulose membranes stacked in the Bio-Dot microfiltration apparatus 
(Bio-Rad). The membrane was blocked for 1 h with 5% milk and 
incubated overnight with the antibody 4C9 at a concentration of 1.5 μg/μl 
in 5% milk diluted in PBS + 0.1% Tween-20. The signal was detected 
with ECL (Bio-Rad, cat. no. 1705061) after 1 h incubation of the membrane 
with secondary antibody (1:10,000). 


HTRF assays 

The HTRF assays were similar to those previously described. The cell or 
tissue lysates were diluted with the original lysis buffer used for 
lysing the samples (PBS + 1% (vol/vol) Triton X-100 + 1x complete 
protease inhibitor (Roche)), and then detected with the indicated antibody pairs 
diluted in the HTRF assay buffer (50 mM NaH₂PO₄, 400 mM NaF, 0.1% 
BSA, 0.05% (vol/vol) Tween-20, 1% (vol/vol) Triton X-100, pH 7.4). The 
donor antibody concentration was 0.023 ng/μL and the acceptor antibody 
concentration was 1.4 ng/μL, both in HTRF assay buffer. Different 
antibody pairs were used for different experiments as indicated in the 
figure legends. For all the samples, the signals were normalized to the 
total protein concentrations to ensure equal loadings. Different protein 
concentrations were pre-tested to ensure that the signals were in the 
linear range. Background corrections were performed by subtracting 
the background signals from blank samples. 
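
The background correction and normalization described above amount to a two-step calculation per sample. The sketch below is a minimal illustration with hypothetical function names and made-up readings; it assumes a single blank reading per plate and a total protein concentration measured for each lysate.

```python
def normalized_htrf(signal, blank_signal, protein_conc):
    """Blank-subtracted HTRF signal per unit of total protein."""
    if protein_conc <= 0:
        raise ValueError("protein_conc must be positive")
    return (signal - blank_signal) / protein_conc


def relative_to_control(sample_value, control_value):
    """Express a normalized signal as a fraction of the vehicle control."""
    return sample_value / control_value


# Hypothetical readings (arbitrary units) and protein concentrations (ug/uL).
control = normalized_htrf(1200.0, 200.0, 2.0)   # 500.0
treated = normalized_htrf(700.0, 200.0, 2.0)    # 250.0
remaining = relative_to_control(treated, control)
```

This calculation is only meaningful within the linear range of the assay, which is why the text reports pre-testing different protein concentrations.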


In vitro c-Raf kinase assay 

In vitro c-Raf kinase assays were carried out with a c-Raf kinase assay 
kit (BPS Bioscience, cat. no. 79570). The assays were performed in a 
96-well plate according to the manufacturer’s instructions. The samples 
and non-reactive negative controls were tested in duplicate according 
to the instructions. 

For details, 25 μL of the mixture containing 5x kinase assay buffer 
(6 μL), ATP (1 μL), 5x Raf substrate (10 μL) and water (8 μL) was added to a 
well. 5 μL of water solution containing a test compound at 10x the desired 
concentration (DMSO was at 10% in the water solution) was added to the 
25 μL of mixture, and 20 μL of 1x kinase assay buffer containing 2 ng/μL 
c-Raf kinase was added to the mixture in the well to initiate the kinase 
reaction (at this stage compounds were at 1x the desired concentration, and 
DMSO was at 1%). For a non-reactive negative control, 
20 μL of 1x kinase assay buffer containing no c-Raf was added to the 
mixture instead. The plate was incubated at 30 °C for 45 min. After the 
45-min reaction, 50 μL of Kinase-Glo Max reagent (Promega, cat. 
no. V6071) was added to each well, and the plate was incubated at room 
temperature for 15 min in the dark. The luminescence was then read with a 
microplate reader (BMG Labtech). The luminescence 
reading measures the level of ATP remaining, which is inversely 
related to kinase activity. The level of ATP remaining (the luminescence 
reading) was subtracted from the non-reactive negative control reading, 
which indicates the level of initially added ATP, to give the amount of ATP 
consumed in the reaction, which represents the kinase activity. 
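
The volumes and the read-out calculation of the kinase assay can be checked with a short script. This is an illustrative sketch: the function names and the luminescence values are hypothetical, and expressing activity as a percentage of a vehicle (DMSO) well is an assumption added here for illustration, not a step stated in the text.

```python
MIX_UL = 6 + 1 + 10 + 8        # 5x buffer + ATP + 5x substrate + water = 25 uL
COMPOUND_UL = 5
ENZYME_UL = 20
TOTAL_UL = MIX_UL + COMPOUND_UL + ENZYME_UL   # 50 uL reaction volume


def final_concentration(stock, added_ul, total_ul=TOTAL_UL):
    """Concentration after diluting added_ul of stock into total_ul."""
    return stock * added_ul / total_ul


def consumed_atp(no_enzyme_reading, sample_reading):
    """ATP consumed = no-enzyme control luminescence - sample luminescence."""
    return no_enzyme_reading - sample_reading


def percent_activity(no_enzyme_reading, vehicle_reading, compound_reading):
    """Kinase activity with compound, relative to the vehicle well (assumed)."""
    vehicle = consumed_atp(no_enzyme_reading, vehicle_reading)
    compound = consumed_atp(no_enzyme_reading, compound_reading)
    return 100.0 * compound / vehicle


dmso_final_percent = final_concentration(10.0, COMPOUND_UL)   # 10% -> 1%
activity = percent_activity(1000.0, 400.0, 700.0)             # 50% of vehicle
```

A higher luminescence reading with compound than with vehicle therefore indicates that less ATP was consumed, that is, that the kinase was inhibited.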


In vitro pull-down assays 

We performed in vitro pull-down assays to test the compounds’ influence 
on the HTT-LC3 interactions. The purified HTTexon1 (with the 
indicated tags), full-length proteins and the control proteins were 
incubated with amylose resin (New England BioLabs, cat. no. E8021L) 
at 4 °C for 30 min. Immobilized amylose resins were then washed three 
times with HBS (20 mM HEPES pH 7.5, 150 mM NaCl, 0.05% Tween-20). 
The resulting amylose resins containing about 10 μg of MBP-fused 
proteins were incubated with the indicated compounds (1 μM for 1005 
and 100 nM for AN2) or the DMSO control at the same volume in 300 
μl of HBS at 4 °C for 1 h using a sample mixer. 40 μg of purified LC3B 
protein was then added and incubated at 4 °C for another 2 h using 
a sample mixer. The resin-bound proteins were eluted with 40 μl maltose 
buffer (10 mM maltose, 20 mM HEPES, 150 mM NaCl, pH 7.5) and then 
mixed with 20 μl SDS-PAGE sample loading buffer. Samples were then 
analysed by SDS-PAGE and western blots. 

GST pull-down was performed following the same procedure described 
above, except that GST-fused LC3B was immobilized onto magnetic 
conjugated GST mouse mAb beads (Cell Signaling Technology, cat. 
no. 11847S) and eluted with SDS-PAGE protein loading buffer by vortexing 
according to the instruction manual. 

In Fig. 4a, b, for MBP pull-down (Fig. 4a), purified HTTexon1-MBP 
(10 μg) or MBP (10 μg) bound to MBP resin was incubated with the purified 
LC3B protein (40 μg) and the indicated compounds. The HTTexon1- 
MBP or the MBP proteins were pulled down and the eluates were tested 
for co-precipitated LC3B. Four per cent of the total eluate was loaded in 
each lane, and the input:pull-down loading ratio was 100%. Both 1005 
and AN2 enhanced LC3B’s interaction with HTTexon1-Q72-MBP, but not 
HTTexon1-Q25-MBP. Note that the MBP blot signals were much weaker 
for the Q72 protein, possibly because recognition of the MBP tag by the 
antibody was affected in the fusion protein. However, data interpretation 
was not influenced, because compound treatments did not alter the 
MBP signals for the Q72 protein (last three lanes). The GST pull-down 
(Fig. 4b) was performed similarly, except that full-length HTT-Q73 or 
full-length HTT-Q23 (both without fusion tags) and GST-LC3B proteins 
were used to precipitate GST-LC3B 
or GST alone with its binding proteins, which were then eluted for detection. 
Note that this pull-down is in the reverse direction of the pull-down in 
Fig. 4a. The input:pull-down loading ratio for the GST blot was 100%, 
whereas the ratio for the HTT blot was 10% to avoid overexposure of 
the input. Both 1005 and AN2 enhanced LC3B’s interaction with the 
full-length HTT-Q73 but not the full-length HTT-Q23 protein. 


Imaging-based autophagy assays 

Analysis of GFP-LC3 puncta for measuring autophagosomes: HeLa cells
stably expressing GFP-LC3 were generated by transfection of pEGFP
C1-LC3, and positive clones were selected with 500 µg/ml G418. The cells
were then treated with vehicle (DMSO, 0.1%), 10O5 or AN2 at the
indicated concentrations; chloroquine (CQ, 20 µM) treatment was used as
a control. After 24 h, cells were fixed in 4% paraformaldehyde (PFA)
for 10 min. Images were acquired by confocal microscopy (Leica
SP8) by an observer blinded to the identity of the slides. The number
and size of GFP vesicles per cell were determined using ImageJ software.
Images were processed with the despeckle function to decrease the
noise, and a threshold was set to highlight puncta. Cells were selected
with the freehand drawing tool. The analyse-particle function was used
to measure the sizes and numbers of GFP puncta.

The mRFP-GFP-LC3 assay: this assay allows us to monitor autophagosome
synthesis and maturation/fusion by labelling autophagosomes
(green and red) and autolysosomes (red), since the low lysosomal pH in
autolysosomes quenches the GFP signals. HeLa cells stably expressing
mRFP-GFP-LC3 were treated with vehicle (DMSO, 0.1%), 10O5 or AN2
at the indicated concentrations; bafA1 (10 nM) treatment was used as
a control. After 24 h, cells were fixed in 4% PFA for 10 min. Images were
acquired by confocal microscopy (Leica SP8) by an observer blinded
to the identity of the slides. The green and red single-channel images
were analysed with ImageJ to quantify green and red puncta in the same
way as in the GFP-LC3 assay described above.

Fig. 4c, d shows representative confocal microscopy images (scale bar,
10 µm) and quantifications of the co-localization between
HTTexon1-MBP-His (red, detected by anti-His immunofluorescence) and
LC3B-GFP (green, detected directly by GFP fluorescence) in transiently
transfected HeLa cells (Fig. 4c) or between endogenous mHTT and
LC3-II in the HD-knock-in mouse striatal cells (STHdhQ7/Q111) (Fig. 4d).
For overexpressed proteins (Fig. 4c), the cells transfected with
LC3B-GFP alone or HTTexon1-MBP-His alone were imaged in both channels
to ensure the specificity of the signals (top). The white arrows indicate
representative co-localization puncta. Parts of the images have been
magnified to show co-localization puncta more clearly (indicated by
orange arrows). Since the puncta were obvious, co-localization was
analysed by counting the red⁺green⁺ (yellow) and the total red⁺ puncta
directly, and then calculating the ratio for each cell. Blind analysis was
performed for quantifications. For endogenous proteins (Fig. 4d),
mHTT was detected by the anti-HTT antibody 2166, and the endogenous
LC3-II was detected by an anti-LC3 antibody that has been reported
to specifically detect LC3-II. Since the signals of endogenous proteins
were more dispersed, the co-localization analysis was performed
blindly by measuring the red⁺green⁺ (yellow) and the total red⁺ pixels
using ImageJ, and then calculating the ratio for each cell.
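
The two per-cell ratios described above (puncta-based for overexpressed proteins, pixel-based for endogenous proteins) can be sketched as follows. This is an illustrative reimplementation in plain Python, not the ImageJ workflow used in the study; the function names and the representation of thresholded images as nested lists of 0/1 values are assumptions.

```python
from typing import Sequence


def puncta_colocalization_ratio(yellow_puncta: int, red_puncta: int) -> float:
    """Fraction of red (HTT) puncta that are also green (LC3) in one cell."""
    if red_puncta <= 0:
        raise ValueError("cell contains no red puncta")
    return yellow_puncta / red_puncta


def pixel_colocalization_ratio(red_mask: Sequence[Sequence[int]],
                               green_mask: Sequence[Sequence[int]]) -> float:
    """Fraction of red-positive pixels that are also green-positive.

    Both masks are thresholded (0/1) images of the same cell and shape.
    """
    red_pixels = 0
    overlap_pixels = 0
    for red_row, green_row in zip(red_mask, green_mask):
        for r, g in zip(red_row, green_row):
            if r:
                red_pixels += 1
                if g:
                    overlap_pixels += 1
    if red_pixels == 0:
        raise ValueError("cell contains no red-positive pixels")
    return overlap_pixels / red_pixels


# Hypothetical cell: 3 of 12 red puncta overlap with green puncta.
print(puncta_colocalization_ratio(3, 12))  # 0.25

# Hypothetical 2 x 3 thresholded masks for one cell.
red = [[1, 1, 0], [0, 1, 1]]
green = [[1, 0, 0], [1, 1, 0]]
print(pixel_colocalization_ratio(red, green))  # 0.5
```

In both cases the denominator is the red (HTT) signal, so the ratio reads as the fraction of HTT signal that overlaps with LC3.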


Detection of long-lived proteins by click-chemistry 

As an indicator of autophagy activity, the degradation of long-lived
proteins was measured as previously reported. In brief,
HeLa cells at 70-80% confluency in a 6-well plate were washed with
warm PBS and cultured in Met-free DMEM (Thermo Fisher Scientific,
cat. no. 21013) supplemented with dialysed FBS for 1 h to deplete intracellular
free Met reserves. The Met analogue L-AHA (50 µM) was then added
to label the proteins for 18 h. After labelling, the cells were washed
with PBS and cultured in regular culture medium containing 10× L-Met
(2 mM) for 2 h to chase out short-lived proteins. The cells were then
treated with the compounds or the DMSO control for 6 h before
cell lysis and protein extraction. For the starvation sample, the culture
medium was replaced with EBSS (Thermo Fisher Scientific, cat. no.
24010043) for 6 h. The protein lysates were then used for the click
reaction with the Click-it reaction kit (Click Chemistry Tools, cat. no.
C1001) following the manufacturer's instructions, so that the remaining
L-AHA-containing long-lived proteins were conjugated with biotin. These
proteins were then analysed by electrophoresis and detected with
HRP-conjugated streptavidin (Beyotime, cat. no. A0303).


Immunofluorescence and caspase-3 imaging 

For immunofluorescence of cultured cells, cells were washed three times
with 1× PBS, fixed in 4% PFA for 10 min, and then washed and
permeabilized in 0.5% (vol/vol) Triton X-100 for 10 min. The cells were
then blocked in blocking buffer (4% BSA + 0.1% (vol/vol) Triton X-100
in 1× PBS) for 30 min and incubated overnight at 4 °C with primary
antibodies, and then washed three times with blocking buffer and incubated
with secondary antibody at room temperature for 1 h. Coverslips were
then washed three times, stained with 0.5 mg/ml DAPI for 5 min at room
temperature, and then mounted in Vectashield mounting medium (Vector,
cat. no. H-1002). Images were taken with Zeiss Axio Vert A1 confocal
microscopes and analysed blindly with ImageJ for co-localization and
TUBB3 quantifications. For co-localization experiments in transfected
HeLa cells (Fig. 4c), the GFP signals were used to detect GFP-LC3B, and
anti-His was used to detect HTTexon1-MBP-His proteins. Empty-vector
transfected cells were imaged to ensure the specificity of the signals.
The co-localization was analysed by calculating the ratio between
overlapping puncta and the HTT (red) puncta for each cell, and the puncta
numbers were counted blindly. For co-localization experiments in
STHdhQ7/Q111 cells, the endogenous mHTT protein was stained with
the HTT antibody (Millipore, cat. no. MAB2166), and autophagosomes
were stained with the LC3B antibody (Thermo Fisher Scientific, cat.
no. 700712), which preferentially detects LC3-II. The co-localization
was analysed with ImageJ to calculate the ratio between overlapping
pixels and the HTT (red) positive pixels, because the signals of the
endogenous proteins were more dispersed and could not be counted
accurately. For TUBB3, the total area of TUBB3 signals and the DAPI
counts were analysed with ImageJ. The former was then divided by the latter
to calculate the averaged area of TUBB3 in each neuron as an index for
neurodegeneration in vitro.
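
The TUBB3 index described above is a single division per image. A minimal sketch, assuming the total TUBB3-positive area and the DAPI nucleus count have already been exported from ImageJ (the function name, the area unit and the example values are hypothetical):

```python
def tubb3_area_per_neuron(total_tubb3_area: float, dapi_count: int) -> float:
    """Average TUBB3-positive area per nucleus for one image.

    total_tubb3_area: summed area of thresholded TUBB3 signal (e.g. in µm²).
    dapi_count:       number of DAPI-stained nuclei in the same image.
    """
    if dapi_count <= 0:
        raise ValueError("no nuclei were counted in this image")
    return total_tubb3_area / dapi_count


# Hypothetical image: 5,000 µm² of TUBB3 signal across 25 nuclei.
print(tubb3_area_per_neuron(5000.0, 25))  # 200.0
```

A lower value indicates less neurite area per cell, which is why the text uses it as an index of neurodegeneration.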

For caspase-3 activity measurements of the iPS-cell-derived neurons,
the NucView 488 caspase-3 dye (Biotium, cat. no. 30029) was
used to detect caspase-3 activity as an indicator of apoptosis.
The images were then taken every 3 h using the Incucyte technology 
(Essen Bioscience, IncuCyte FLR), which takes images of 4 different 
fields in each well inside the cell culture incubator. The quantification 
was performed by the Incucyte 2011A software, which identified the 
green fluorescent puncta and quantified the fluorescent object count 
per field. The 4 fields per well were quantified and averaged, and 4 
independent wells were used for statistical analysis. 
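
The aggregation described above (four fields averaged per well, four wells per condition) can be sketched as below. The data layout and function names are assumptions, and reporting the mean with the s.e.m. across wells is an assumed summary chosen to match the way data are presented in the figure legends.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def well_means(fields_per_well: Sequence[Sequence[float]]) -> List[float]:
    """Average the fluorescent object counts of the fields within each well."""
    return [statistics.mean(fields) for fields in fields_per_well]


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean across independent wells."""
    if len(values) < 2:
        raise ValueError("at least two wells are needed for an s.e.m.")
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


# Hypothetical object counts: 4 wells x 4 fields at one time point.
counts = [
    [10, 12, 14, 16],
    [20, 22, 24, 26],
    [30, 32, 34, 36],
    [40, 42, 44, 46],
]
per_well = well_means(counts)
print(per_well)
print(mean_and_sem(per_well))
```

Averaging fields first keeps the well, not the field, as the independent unit for statistics.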


Antibodies 

Antibodies used for western blots, HTRF and/or immunofluorescence/
immunohistochemistry are as follows: the HTT antibodies
2B7, ab1 and MW1 have been described previously; commercially
purchased antibodies include HTT antibody 2166 (Millipore, cat. no.
MAB2166); anti-polyQ antibody 3B5H10 (Sigma, cat. no. P1874);
anti-HTT antibody (D7F7) XP (Cell Signaling Technology, cat. no. 5656S);
anti-β-tubulin (Abcam, cat. no. ab6046); anti-TUBB3 (BioLegend
(previously Covance), cat. no. 801202); anti-ATXN3 (Millipore, cat.
no. MAB5360); anti-Gapdh (Proteintech, cat. no. 60004-1); anti-NBR1
(Thermo Fisher Scientific, cat. no. PA5-54660); anti-β-actin (Beyotime,
cat. no. AA128); anti-TBP (Abcam, cat. no. ab818); anti-P62 (Thermo
Fisher Scientific, cat. no. PA5-27247); anti-spectrin (Millipore, cat. no.
MAB1622); anti-Ncoa4 (Santa Cruz, cat. no. sc-373739); anti-GST
(Proteintech, cat. no. HRP-66001); anti-GFP (Cell Signaling Technology,
cat. no. 2956); anti-MBP (Proteintech, cat. no. 15089-1-AP); anti-His
(Beyotime, cat. no. AH367); anti-BUBR1 (BD Transduction, cat. no.
612503); anti-phospho-p44/42 MAPK (ERK1/2) and anti-phospho-MEK1/2
in the Phospho-Erk1/2 Pathway Sampler Kit (Cell Signaling
Technology, cat. no. 9911); anti-LC3B (Thermo Fisher Scientific, cat.
no. PA1-16930 (for western blot) and cat. no. 700712 (for
immunofluorescence)). All the antibodies used for immunofluorescence in
this study have been validated by knock-down experiments. All the
HTT, polyQ and ATXN3 antibodies used for HTRF and/or western blots
have been validated by knock-down experiments and by comparing
the signals from different genotypes in previous studies from us
and others. All the other antibodies have been validated by previous
literature or the vendor.


Compound detection in vivo in brain tissue from ip-injected mice 
The experiments were performed by the SIM-Servier joint laboratory.
The mice, ip-injected with DMSO or the indicated compounds, were
anesthetized with chloral hydrate (200 µL/kg of 10% stock) at the indicated
time points, and the heart blood was collected in vacuum blood collection
tubes. The heart blood samples were further spun at 10,000 r.p.m.
for 5 min to generate the heart plasma. The mice were then perfused
with 1× PBS to remove the blood. The mice were then euthanized and
the brain samples were dissected. Five times the volume of
methanol:acetonitrile (50:50, vol/vol) was added to each sample, which was then
homogenized. Following ultrasonic treatment for 15 min, the homogenates
were centrifuged for 5 min, then 20 µL of the supernatant was
mixed with 20 µL water for 30 s before injection. The linear range of 10O5
was 10-30,000 ng/mL, and the linear range of AN2 was 0.3-10,000
ng/mL. The LC-MS/MS analyses were performed on an Acquity ultra
performance liquid chromatography (UPLC) system (Waters Corporation)
coupled to a Xevo TQ-S mass spectrometer (Waters Corporation).
Chromatographic separation was performed using an Acquity UPLC
BEH C18 (1.7 µm, 2.1 × 50 mm) column supplied by Waters at a flow rate of
0.5 mL/min. Gradient elution was used with a mobile phase composed of
solvent A (water containing 0.1% formic acid and 5 mM NH₄Ac) and
solvent B (acetonitrile:methanol (9:1, vol/vol) containing 0.1% formic acid).


Proteomics analysis 

Samples were analysed on Orbitrap Fusion Lumos mass spectrometers 
(Thermo Fisher Scientific) coupled with an Easy-nLC 1000 nanoflow 
LC system (Thermo Fisher Scientific). Dried peptide samples were 
re-dissolved in Solvent A (0.1% formic acid in water) and loaded to a trap
column (100 µm × 2 cm; particle size, 3 µm; pore size, 120 Å; SunChrom)
with a max pressure of 280 bar using Solvent A, then separated on a 150
µm × 15 cm silica microcolumn (particle size, 1.9 µm; pore size, 120 Å;
SunChrom) with a gradient of 5-35% mobile phase B (acetonitrile and 0.1%
formic acid) at a flow rate of 600 nl min⁻¹ for 75 min. The FAIMS device was
placed before the mass spectrometer. FAIMS separation was performed
with the following settings: inner electrode temperature = 100 °C,
outer electrode temperature = 100 °C, carrier gas flow = 4.6 l min⁻¹,
dispersion voltage = −5,000 V, entrance plate voltage = 250 V. The
FAIMS carrier gas is N₂ only. The noted CVs were applied to the FAIMS
electrodes. Each of the selected CVs was applied to sequential survey 
scans and MS/MS cycles (1s); the MS/MS CV was always paired with the 
appropriate CV from the corresponding survey scan. For detection with 
Fusion or Fusion Lumos mass spectrometry, a precursor scan was carried
out in the Orbitrap by scanning m/z 300-1400 with a resolution of
120,000. The most intense ions selected under top-speed mode were
isolated in the quadrupole with a 1.6 m/z window and fragmented by higher
energy collisional dissociation (HCD) with normalized collision energy 
of 30%, then measured in the linear ion trap using the rapid ion trap 
scan rate. Automatic gain control targets were 5 x 10° ions with a max 
injection time of 50 ms for full scans and 1 x 10‘ with 35 ms for MS/MS 
scans. Dynamic exclusion time was set at 18 s. Data were acquired using 
the Xcalibur software (Thermo Scientific). 

Raw files were searched against the human National Center for Bio- 
technology Information (NCBI) Refseq protein database (updated 
on 04-07-2013, 32,015 entries) by Mascot 2.3 (Matrix Science) imple- 
mented on Proteome Discoverer 2.2 (Thermo Scientific). The mass 
tolerances were 20 ppm for precursor and 0.5 Da for product ions for 
Fusion Lumos. Up to two missed cleavages were allowed. The search 
engine set cysteine carbamidomethylation as a fixed modification
and N-acetylation and oxidation of methionine as variable modifications.
Precursor ion score charges were limited to +2, +3 and +4. The data
were also searched against a decoy database so that protein
identifications were accepted at a false discovery rate of 1%. Protein
quantifications were calculated using a label-free, intensity-based
absolute quantification (iBAQ) approach.


Proteins with at least 2 unique peptides with 1% FDR at the peptide 
level and Mascot ion score greater than 20 were selected for further 
analysis. The file used for protein inference and protein FDR calcula- 
tion was derived from Mascot search results, and the peptide spectrum 
match (PSM) was filtered via Percolator and customized parameters, 
and then the proteins were assembled. The protein FDR was calculated 
depending on the ratio of NPD (the number of assembled proteins from 
decoy database searches) and NPT (the number of assembled proteins 
from target database searches). The fraction of total (FOT) was used to represent the
normalized abundance of a particular protein across samples. FOT was 
defined as a protein’s iBAQ divided by the total iBAQ of all identified 
proteins within one sample. The FOT was multiplied by 10° for the ease 
of presentation. Only the proteins detected in all compared samples
were used for comparison.
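
The FOT normalization, the decoy-based protein FDR and the restriction to proteins detected in every sample can be sketched as follows. This is an illustration only: the function names, the dictionary-based data layout and the example values are hypothetical, and the scaling constant is exposed as a parameter whose default of 1e5 is an assumption, not a value taken from the text.

```python
from typing import Dict, Sequence, Set


def fraction_of_total(ibaq: Dict[str, float], scale: float = 1e5) -> Dict[str, float]:
    """FOT: each protein's iBAQ divided by the summed iBAQ of the sample.

    `scale` is the presentation multiplier (default is an assumption).
    """
    total = sum(ibaq.values())
    if total <= 0:
        raise ValueError("sample has no quantified proteins")
    return {protein: value * scale / total for protein, value in ibaq.items()}


def protein_fdr(n_decoy_proteins: int, n_target_proteins: int) -> float:
    """Protein-level FDR estimated as NPD / NPT."""
    if n_target_proteins <= 0:
        raise ValueError("no target proteins were assembled")
    return n_decoy_proteins / n_target_proteins


def proteins_in_all_samples(samples: Sequence[Dict[str, float]]) -> Set[str]:
    """Proteins detected in every compared sample."""
    shared = set(samples[0])
    for sample in samples[1:]:
        shared &= set(sample)
    return shared


# Hypothetical iBAQ values for two samples.
sample_a = {"HTT": 2.0, "LC3B": 6.0, "GAPDH": 2.0}
sample_b = {"HTT": 1.0, "GAPDH": 3.0}
print(fraction_of_total(sample_a, scale=100))
print(protein_fdr(5, 1000))
print(sorted(proteins_in_all_samples([sample_a, sample_b])))
```

Because FOT divides by the per-sample total, values are comparable across samples that differ in the overall amount of protein loaded.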


Behavioural and lifespan experiments in HD Drosophila models 
For behavioural experiments, we placed 15 age-matched virgin female 
flies in an empty vial and tapped them down. The percentage of flies 
that climbed past a 7-cm-high line after 15 s was recorded. The mean 
of five observations is plotted for each vial on each day, and data from 
multiple vials containing different batches of flies were plotted and 
analysed by two-way ANOVA tests. The flies were randomly placed into 
each tube. For lifespan measurements, we placed 75 age-matched virgin
female flies in an empty plastic vial and recorded survival
for each vial on each day. For both behavioural and lifespan measurement
experiments, the person who performed the experiments was
blinded to the drugs fed until data analysis.
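
The climbing score described above (percentage of 15 flies past the line, averaged over five observations per vial per day) can be sketched as below. The function names and example counts are hypothetical.

```python
import statistics
from typing import Sequence

FLIES_PER_VIAL = 15  # number of flies per vial, as stated in the text


def climbing_percentage(n_climbed: int, n_total: int = FLIES_PER_VIAL) -> float:
    """Percentage of flies that passed the 7-cm line within 15 s."""
    if not 0 <= n_climbed <= n_total:
        raise ValueError("climbed count must lie between 0 and n_total")
    return 100.0 * n_climbed / n_total


def vial_daily_score(climbed_counts: Sequence[int]) -> float:
    """Mean climbing percentage over the repeated observations of one vial."""
    return statistics.mean(climbing_percentage(n) for n in climbed_counts)


# Hypothetical counts from five observations of one vial on one day.
print(vial_daily_score([15, 12, 9, 12, 12]))  # 80.0
```

Each vial then contributes one value per day to the two-way ANOVA mentioned in the text.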


Behavioural experiments in HD mouse models 

All the behavioural experiments were performed during the light phase
and the experimenters were blinded to the compound treatment and 
the genotype of each mouse. Both males and females were used. All 
the mice were kept in the behavioural test room in dim red light for 
1h before starting the experiments. For rotarod experiments, mice 
were pre-trained on 3 consecutive days on the rotarod rotating at
4 r.p.m. for 2 min. Mice were then tested for five days at an accelerating
speed ranging from 4 to 40 r.p.m. within 2 min. Each performance was
recorded as the time in seconds spent on the rotating rod until falling 
off or until the end of the task. Each test included three repetitions with 
an inter-trial interval of 60 min in order to reduce stress and fatigue, 
and the means from these three runs were analysed for each mouse. 
The balance beam test was run using a 2-cm-thick metre stick suspended
from a platform on both sides by metal grips. The total length was
100 cm. There was a bright light at the starting point and a dark box with
food at the endpoint. The total time for each mouse to walk through 
the beam was recorded. For gripping force measurements, mice were 
allowed to grip the metal grids of a grip meter (Ametek Chatillon) with 
their forelimbs, and they were gently pulled backwards by the tail until 
they could no longer hold the grids. The peak grip strength observed 
in 10 trials was recorded. 
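
The per-mouse summaries described above can be sketched as follows: the mean latency of three rotarod runs and the peak grip strength over 10 trials. The function names and example values are hypothetical, and the linear speed ramp in `rotarod_speed` is an assumption, since the text gives only the start speed, end speed and duration.

```python
import statistics
from typing import Sequence


def rotarod_speed(t_seconds: float, start_rpm: float = 4.0,
                  end_rpm: float = 40.0, ramp_seconds: float = 120.0) -> float:
    """Rod speed at time t, assuming a linear ramp from 4 to 40 r.p.m. in 2 min."""
    t = min(max(t_seconds, 0.0), ramp_seconds)
    return start_rpm + (end_rpm - start_rpm) * t / ramp_seconds


def rotarod_score(latencies_s: Sequence[float]) -> float:
    """Mean latency to fall (s) over the three runs of one test."""
    if len(latencies_s) != 3:
        raise ValueError("each test consists of three runs")
    return statistics.mean(latencies_s)


def peak_grip_strength(trials: Sequence[float]) -> float:
    """Peak grip strength observed across the trials of one mouse."""
    return max(trials)


# Hypothetical data for one mouse.
print(rotarod_speed(60.0))                # 22.0
print(rotarod_score([60.0, 75.0, 90.0]))  # 75.0
print(peak_grip_strength([80, 95, 90, 85, 100, 92, 88, 97, 91, 89]))  # 100
```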


Statistics 

To ensure a statistical power >0.8, power analyses were performed
for each assay based on estimated values by PASS 16
(https://www.ncss.com/software/pass/) before experiments. Estimation was
based on our previously published results on similar experiments and
preliminary experiments. The effect size was also estimated by Cohen's
d, the difference between two means divided by the standard deviation
for the data. The power
analysis suggested n > 3 for mHTT level measurements and n= 5 for 
behavioural experiments. In all the experiments we performed, we have 
used a larger n than these numbers in case the effect was smaller than 
preliminary results, and we also performed post-experiment power 
analyses to ensure that power = 0.8 for all the significant differences. 
Statistical comparisons between two groups were conducted by the 
unpaired two-tailed t-tests. Statistical comparisons among multiple
groups were conducted by one-way ANOVA tests and post hoc tests
for the indicated comparisons (Dunnett’s tests for comparison with 
a single control, and Bonferroni’s tests for comparisons among dif- 
ferent groups). Statistical comparisons for series of data collected at 
different time points were conducted by two-way ANOVA tests. The 
similarity of variances between groups to be compared was tested 
when performing statistics in GraphPad Prism 7 and Microsoft Excel 
2016. Normality of data sets was assumed for ANOVA and t-tests, and 
was tested by Shapiro-Wilk tests. When the data were significantly 
different from normal distribution, nonparametric tests were used for 
statistical analysis. All statistical tests were unpaired and two-tailed. 
For the in vivo experiments in the mouse, randomization was per- 
formed by assigning random numbers. For the Drosophila experiments, 
the flies were randomly distributed into the vials after anesthesia. 
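
The effect-size estimate mentioned above, Cohen's d, can be sketched in plain Python. This is a minimal illustration and not the PASS 16 calculation: the text does not say which standard deviation was used, so the pooled standard deviation of the two groups is an assumption here, and the function name and example values are hypothetical.

```python
import math
import statistics
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Cohen's d: difference between two means divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b)
                          / (n_a + n_b - 2))
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical normalized mHTT signals for control and treated wells.
control = [2.0, 4.0, 6.0]
treated = [1.0, 3.0, 5.0]
print(cohens_d(control, treated))  # 0.5
```

A larger absolute d means that fewer samples are needed to reach a given power, which is how the estimate feeds into the sample-size calculation.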


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


The protein structure data has been uploaded to the Protein Data Bank 
with accession number 6J04. Source data for all figure plots are pro- 
vided with the paper. The full gel blots and the proteomics data sets 
have been provided in the Supplementary Information. The data that 
support the findings of this study are available from the corresponding 
authors upon reasonable request.

Extended Data Fig. 1 | Protein purifications. a, SDS-PAGE and linear mode
MALDI-TOF mass spectrometry analysis of the expression and purification of
recombinant LC3B protein. Left, SDS-PAGE: lane 1, the whole cell lysate before
induction; lane 2, the whole cell lysate after induction; lane 3, the supernatants
of induced cells; lane 4, the flow-through fraction of Ni-NTA chromatography;
lane 5, the eluates of Ni-NTA chromatography (GST-His-LC3); lane 6, LC3B
eluate after removal of the GST-His tag by TEV protease; lane 7, the eluates of
size-exclusion chromatography; lane 8, molecular weight marker. Right, the m/z
peak of recombinant LC3B is 14,660.811, consistent with theoretical calculations.
b, Structural alignment of purified recombinant LC3B(ΔG120) (PDB ID: 6J04,
yellow) with the published LC3B structure (PDB ID: 1UGM, cyan) by PyMOL.
c, d, SDS-PAGE and linear mode MALDI-TOF mass spectrometry analysis of the
HTTexon1 proteins. c, Left, SDS-PAGE for HTTexon1(Q72)-MBP: lane 1, the
supernatants of induced cells; lane 2, the insoluble fraction of induced cells;
lane 3, the flow-through fraction of Ni-NTA chromatography; lane 4, the eluates
of Ni-NTA chromatography; lane 5, the eluates of size-exclusion
chromatography; lane 6, molecular weight marker. d, Left, SDS-PAGE for
HTTexon1(Q25)-MBP: lane 1, molecular weight marker; lane 2, the induced cell
lysate; lane 3, the supernatant fraction of induced cells; lane 4, the
flow-through fraction of Ni-NTA chromatography; lanes 5 and 6, the eluates of
Ni-NTA chromatography; lanes 7 and 8, the eluates of size-exclusion
chromatography. The m/z peaks of HTTexon1(Q72)-MBP (c, right) and
HTTexon1(Q25)-MBP (d, right) are 64,225.946 and 58,228.893, consistent with
theoretical calculations. e, Left and middle, size-exclusion chromatography of
the recombinant full-length HTT(Q73) (flHTT-Q73) and HTT(Q23) (flHTT-Q23)
proteins using Superose 6 5/150 GL. The major peak fractions were collected
and pooled for the SDS-PAGE analysis (right). f, SDS-PAGE analysis of
purified MBP-His (MBP), sfGFP (GFP) and Rpn10 proteins.


Extended Data Fig. 2 | Negative controls for OI-RD measurements and
validation of the compounds' interaction with HTT and LC3 by MST.
a, Similar to Fig. 1c-e, but for negative control proteins MBP-His (MBP), sfGFP
and Rpn10. Association-dissociation curves of surface-immobilized
compounds 8F20 and 10O5 with these proteins were measured by OI-RD, and
no compound-protein interactions were detected. For all
association-dissociation curves, vertical dashed lines mark the starts of
association and dissociation phases of the binding event. b, Binding of 10O5
and 8F20 to full-length HTT(Q73) (flHTT(Q73), black dots) or LC3B (red dots)
in standard treated capillaries measured by MST. The compound-bound protein
fractions (bound/total) were calculated from the MST signals
($F_{\mathrm{norm}}$) at each compound concentration, as well as the bound
($F_{\mathrm{norm,bound}}$, set as 100%) and the unbound
($F_{\mathrm{norm,unbound}}$, set as 0%) MST signals:
$\text{bound/total} = (F_{\mathrm{norm}} - F_{\mathrm{norm,unbound}})/(F_{\mathrm{norm,bound}} - F_{\mathrm{norm,unbound}}) \times 100\%$.
The fitted curves and Kd values calculated by
Nanotemper analysis software (v.1.5.41) for flHTT(Q73) and LC3B are indicated
in each panel. Consistent with the OI-RD measurements (Fig. 1e), no binding
was observed for the flHTT(Q23) protein (blue dots). The MST experiments
were repeated more than three times and showed consistent results. c, Similar
to b, except using the compounds indicated on the x axis. MST measurements
of the binding of indicated compounds to full-length HTT(Q73) (flHTT-Q73),
full-length HTT(Q23) (flHTT-Q23) and LC3B in standard treated capillaries. The
proteins tested are indicated in the legends. d, Similar to Fig. 1c-e, but plotting
the association-dissociation curves of surface-immobilized compound AN2
with full-length HTT(Q73) (Q73), full-length HTT(Q23) (Q23), LC3B or the
negative-control proteins MBP-His (MBP), sfGFP and Rpn10. For all
association-dissociation curves, vertical dashed lines mark the starts of
association and dissociation phases of the binding event. The red dashed lines
are global fits to a Langmuir reaction model with the global fitting parameters
listed at the bottom of each plot. No binding signals were observed for
full-length HTT(Q23) proteins, and thus the parameters were not presented. e, Cell
viability of cultured HD neurons measured by the CellTiter-Glo
assay. No toxicity was observed within the concentration range presented in
Fig. 2, although the compound 8F20 became toxic to the cells when the
concentration reached 300 nM.

Extended Data Fig. 3 | mHTT-lowering effects by mHTT-linker compounds
could be detected by multiple antibodies and were dependent on
autophagy. a, Representative western blots (HTT detected by the 2166
antibody) and quantifications of compound-treated cultured cortical neurons
from HdhQ7/Q140 HD-knock-in mice. The neurons were treated with the indicated
compounds (100 nM for 10O5, 8F20 and AN1; 50 nM for AN2) with or without
the autophagy inhibitor NH₄Cl (top) or chloroquine (bottom left), or the
autophagy activator rapamycin (bottom right). The same amount of culture
medium was added in the controls (top). The statistical analysis was performed
by one-way ANOVA with post hoc Dunnett's tests, and the F, degree of freedom
and post hoc P values are indicated in each bar plot. b, Western blots using
indicated HTT or polyQ antibodies for samples from cultured cortical neurons
treated with the indicated compounds: 10O5 (100 nM), 8F20 (100 nM) or AN2
(50 nM). The HTT gel blots presented in Fig. 2d (right) were cropped from the first
four blots. The low-molecular-weight bands were run out in these blots so that
the wtHTT and mHTT could be better separated. Note that the weak bands just
above 250 kDa in the first two blots were leftover signals from the spectrin
blotting. The spectrin signals were too strong to be stripped completely.
c, Western blots using the antibody MW1 or 3B5H10, which detects mHTT
specifically. We ensured that the relatively low-molecular-weight proteins did
not run out of the gels. No increase of potential polyQ-containing mHTT
N-terminal fragments was observed. d, iPS-cell-derived striatal neurons from a
patient with HD (Q47) were treated with the indicated compounds (100 nM,
with 0.1% DMSO) in the presence of an additional 0.1% DMSO or 10 mM NH₄Cl, and
the mHTT levels were measured by HTRF using the 2B7/MW1 antibody pair. All
signals were normalized to the averaged signals from the DMSO control group.
The statistical analysis was performed by one-way ANOVA with post hoc
Dunnett's tests, and F, degree of freedom and post hoc P values are indicated in
each bar plot. ****P < 0.0001. The post hoc analysis was not performed if the
ANOVA tests did not show significance (P > 0.05). e, Immortalized fibroblasts
from a patient with HD (Q47) were transfected with the non-targeting control
siRNA (Neg siRNA) or the ATG5 siRNA (target sequence,
GCCUGUAUGUACUGCUUUA; ATG5 mRNA was knocked down to 17.7 ± 3.0%,
n = 3, as tested by reverse transcription with quantitative PCR), and then
treated after 24 h with the indicated compounds (100 nM) for a further 48 h.
mHTT levels were then measured by HTRF using the 2B7/MW1 antibody pair. All
signals were normalized to the averaged signals from the DMSO control group.
The statistical analysis was performed by one-way ANOVA with post hoc
Dunnett's tests, and F, degree of freedom and post hoc P values are indicated in
each bar plot. ****P < 0.0001. The post hoc analysis was not performed if the
ANOVA tests did not show significance (P > 0.05). The western blot of LC3
confirmed the partial inhibition of autophagy in the ATG5-knockdown cells.
f, Similar to e, but in wild-type (WT) or Atg5-knockout (Atg5 KO) mouse
embryonic fibroblast lines (MEF) transfected with full-length mHTT
(flHTT-Q73). The western blot of LC3 confirmed the inhibition of autophagy in
the Atg5-KO cells. For all panels, n indicates the number of independently
plated wells, and bars represent mean and s.e.m. Full blots of cropped gels are
shown in Supplementary Fig. 1.

Extended Data Fig. 4 | Potential influence on c-Raf and KSP pathways
following treatment with the mHTT-LC3 linker compounds.
a, Representative results (from three biological repeats) of the in vitro c-Raf
kinase assay (see Methods) showing that only 10O5 inhibits c-Raf activity within
the concentration range tested. b, Representative western blots and
quantifications of phospho-MEK and phospho-ERK as indicators of Raf activity
(left) and phospho-BUBR1 as an indicator of KSP inhibition (right) in cultured
cortical neurons treated with indicated compounds (100 nM for 10O5, 8F20,
AN1, and 50 nM for AN2) or the DMSO control. c, Similar to b, but in
immortalized fibroblasts from a patient with HD (Q47). Note that
phospho-BUBR1 is essentially absent and too weak to quantify, indicating that KSP was
not inhibited by any of the compounds at the concentration tested. Data are
mean ± s.e.m. In b, c, all data were corrected by the loading control (β-tubulin)
and normalized to the averaged signal of the DMSO control group. The
statistical analysis was performed by one-way ANOVA and F, degree of freedom
and post hoc P values are indicated in each bar plot. n indicates the
number of independently plated and treated wells.

Extended Data Fig. 5 | mHTT-LC3 linker compounds lowered mHTT in
transgenic HD flies. a, Overlay between LC3B and predicted Atg8 structure
showing high structural similarities. b, Transgenic flies expressing full-length
HTT(Q128) driven by elav-GAL4 were fed with indicated compounds at 10 µM
for 6 days, and protein lysates were extracted from the heads. mHTT was then
measured by HTRF using the 2B7/MW1 antibody pair. Each dot represents the
HTRF signal from each individual sample extracted from five fly heads. All the
data were normalized to the average of the DMSO-fed control samples. The
statistical analysis was performed by one-way ANOVA and Dunnett's post hoc
tests. F(4, 31) = 15.67; ****P < 0.0001. c, 10O5 (top) and AN2 (bottom)
concentrations in heart plasma and brain tissues were measured by mass
spectrometry at the indicated time points for compound-injected mice
(0.5 mg kg⁻¹). In the brain tissue, the 10O5 concentrations were ~20 to ~200 nM,
and the AN2 concentrations were ~20 to ~40 nM, close to the effective doses
that were capable of lowering mHTT in cultured neurons. Data are mean ±
s.e.m.

Extended Data Fig. 6 | mHTT-LC3 linker compounds lowered mHTT in vivo in
mouse brains. a, Western blots (4 mice (3 months old) for each group) and
quantifications of mHTT and wtHTT in the cortices from HdhQ7/Q140-knock-in
mice with intracerebroventricular injection of the indicated compounds (2 µl
at 25 µM for each mouse) for 10 days at one dose per day. HTT was detected by
western blot using the 2166 antibody, and the statistical analysis was
performed by one-way ANOVA and post hoc Dunnett's tests. F, degree of
freedom and post hoc P values are indicated below each bar plot. b, Similar to a,
except that the compounds were delivered to 5-month-old HdhQ7/Q140 mice by
intraperitoneal injection (0.5 mg kg⁻¹) for 14 days at one dose per day. c, Similar
to b, but from striata of intraperitoneally injected mice. The mice were injected
at 10 months old for 14 days at one dose per day. d, Left, representative dot blot
results (from two technical replicates) of the protein lysates from b using the
4C9 antibody, which preferentially detects mHTT aggregates. Middle,
quantification of the dot blots based on the averaged signals from two
technical replicates. Right, measurement of mHTT aggregates by the 4C9-4C9
HTRF assay. In all panels, n indicates the number of mice tested, and bars
represent mean and s.e.m. For quantification, two to three technical replicates
were averaged for each mouse. Statistical analysis was performed by one-way
ANOVA with post hoc Dunnett's tests, and F, degree of freedom and post hoc
Pvalues are indicated in each bar plot. 
Extended Data Fig. 7 | mHTT-LC3 linker compounds did not influence
autophagy. a, HeLa cells stably expressing GFP-LC3B were treated with 2 µl
vehicle (0.1% DMSO), 10O5 or AN2 at the indicated concentrations for 24 h;
chloroquine (CQ, 20 µM) treatment was used as a control. After 24 h, cells were
fixed and images were acquired by confocal microscopy. The number and size
of GFP vesicles per cell were determined using ImageJ software (n indicated on
top of each plot). For each treatment, more than 20,000 puncta were
quantified (~100 puncta per cell from 226 cells). Scale bar, 10 µm.
b, Representative images and quantifications of the numbers of
autophagosomes (GFP+ puncta) and autolysosomes (RFP+GFP− puncta) in HeLa
cells stably expressing mRFP-GFP-LC3B. Scale bar, 10 µm. Autophagosome
numbers or sizes were not influenced by 10O5 and AN2 at the indicated
concentrations after 24 h treatment (or 4 h treatment, not shown). The
autophagosome fusion was also unaffected, as indicated by the autolysosome
number. Note that the autophagosome and autolysosome numbers and sizes
were based on image analysis of the puncta, some of which may represent
multiple vesicles. Green vesicles are considered to be autophagosomes (GFP+
puncta) and red vesicles are considered to be both autophagosomes and
autolysosomes. The number of autolysosomes (RFP+GFP− puncta) was
calculated by subtracting the number of green vesicles from that of the red
vesicles. More than 10,000 puncta from 194 cells were analysed. c,
Representative western blots and quantifications of HeLa cells stably
expressing GFP-LC3B. The 'free GFP' was generated by lysosomal cleavage, and
thus the free GFP/GFP-LC3B ratio was used as an index for autophagy flux,
which was unaffected by 10O5 or AN2, but decreased by the autophagy flux
inhibitor chloroquine. d, Representative western blots and quantifications of
the chase signal of long-lived proteins in HeLa cells as an indicator of autophagy
flux (see Methods). Consistent with previous reports, starvation reduced the
long-lived protein chase signal, whereas rapamycin treatment had a milder
effect. The mHTT-LC3 linker compounds 10O5 and AN2 had no influence in this
assay. e, Representative western blots and quantifications of LC3 in cultured
cortical neurons treated with the indicated compounds. Normalized LC3-II/
LC3-I was used as the indicator of autophagy. Right blot: 10O5, 100 nM; AN2, 50
nM. f, SQSTM1 (p62) levels were determined by western blot for the cortical
tissues from mice injected with the indicated compounds or DMSO control.
Bars indicate mean and s.e.m.; n indicated in each bar shows the number of cells
(a, b), the number of independently plated wells (c–e) or the number of mice (f).
Data are mean ± s.e.m. The statistical analysis was performed by one-way
ANOVA with post hoc Dunnett's tests (a–e) or two-tailed unpaired t-tests (f).
Note that the post hoc tests were not performed if the ANOVA tests failed to
show significance. ****P < 0.0001 (post hoc test).
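The two derived quantities described in this legend, the autolysosome count (red puncta minus green puncta) and the free GFP/GFP-LC3B flux index, are simple arithmetic on the measured values. The sketch below illustrates them under the assumption that per-cell puncta counts and band intensities are already available; the function names and all numbers are hypothetical and are not taken from the study.

```python
def autolysosome_count(red_puncta, green_puncta):
    """Autolysosomes (RFP+GFP-) estimated as red puncta minus green puncta.

    Red puncta include both autophagosomes and autolysosomes, whereas
    green puncta are counted as autophagosomes only.
    """
    return red_puncta - green_puncta


def autophagy_flux_index(free_gfp_intensity, gfp_lc3b_intensity):
    """Ratio of the free GFP band to the GFP-LC3B band on a western blot."""
    return free_gfp_intensity / gfp_lc3b_intensity


# Hypothetical per-cell counts and band intensities (arbitrary units).
print(autolysosome_count(red_puncta=100, green_puncta=60))  # 40
print(autophagy_flux_index(free_gfp_intensity=30.0, gfp_lc3b_intensity=120.0))  # 0.25
```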
Extended Data Fig. 8 | Investigation on the specificity of mHTT-lowering
effects of mHTT-LC3 linker compounds. a, Representative western blots and
quantifications of cultured cortical neurons treated with the indicated
compounds. None of the proteins tested showed a clear effect (>10%).
b, Volcano plots of the proteomics analysis of cortices from intraperitoneally
injected HD mice (10 months old, 4 mice per group, injected for 14 days). Mice
were injected with 0.5 mg kg−1 protein with 110 µg kg−1 DMSO, and an equal amount
of vehicle containing DMSO was injected in the control mice. Only proteins
detected in both groups of samples used for comparisons were calculated and
plotted. Red arrows indicate HTT. See Supplementary Table 2 for complete
datasets. The bar plots indicate the total HTT levels normalized to the DMSO
control. The actual mHTT reduction is anticipated to be higher, because the
compounds reduced mHTT in an allele-selective manner. c, Similar to b, but in
cultured cortical neurons (from postnatal day 0 pups, three wells per group).
See Supplementary Table 3 for complete datasets. In all panels, data are
mean ± s.e.m.
Extended Data Fig. 9 | mHTT-LC3 linker compounds lowered the mutant
ATXN3 protein with polyQ expansion in an allele-selective manner.
a, Representative western blots and quantifications of ATXN3 levels in a
fibroblast line from a patient with SCA3 treated with the indicated compounds.
The lowering of mutant (Q74) but not wild-type (Q27) ATXN3 was observed on
treatment with the linker compounds tested. b, Quantification of the GFP intensity as
an indicator of polyQ-sfGFP (25Q-GFP, 38Q-GFP, 46Q-GFP and 72Q-GFP)
protein levels in transfected HEK293T cells treated with the indicated
compounds using Incucyte. Reduction of 72Q-GFP, 46Q-GFP and 38Q-GFP
but not 25Q-GFP was observed. In a and b, the compound concentrations were
100 nM for 10O5 and AN1, and 50 nM for AN2. Bar plots present mean ± s.e.m.,
and n indicates the number of independently plated wells. c, SDS-PAGE
analysis of polyQ-sfGFP proteins (25Q, 38Q, 46Q, 53Q and 72Q) purified from
HEK293T cells. The protein purification methods were similar to those for HTT
proteins. d, Binding of 10O5, AN1 and AN2 to sfGFP (GFP) or different
polyQ-sfGFP (25Q-GFP, 38Q-GFP and 72Q-GFP) proteins in standard treated
capillaries measured by MST, performed and analysed similarly as in Extended
Data Fig. 2b. All these compounds interact with 38Q-GFP and 72Q-GFP but not
with 25Q-GFP or GFP. e, Association-dissociation curves of surface-
immobilized compounds 10O5, AN1 and AN2 with polyQ-sfGFP (72Q, 53Q,
46Q, 38Q and 25Q) proteins. For all association-dissociation curves, vertical
dashed lines mark the starts of association and dissociation phases of the
binding event. The red dashed curves are fits to a Langmuir reaction model
with the fitting parameters listed at the bottom of each plot. No binding signals
were observed for 25Q-sfGFP (25Q). f–h, Results of mouse behavioural tests
performed similarly to those in Fig. 5d–f, except that the mice were injected
with saline (0.9% NaCl) with DMSO (110 µg kg−1) or without DMSO. The
statistical analysis was performed by two-way ANOVA with post hoc
Bonferroni's tests, and F, P values and degrees of freedom are indicated in the
table below each plot. In all panels, data are mean ± s.e.m.
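The legend does not give the equations of the Langmuir reaction model used for the fits in e. As a rough illustration only, the sketch below assumes the usual 1:1 form, in which the signal rises towards an equilibrium value during association and decays exponentially during dissociation. The function name, parameter names and numerical values are assumptions made for this example and are not the fitted parameters from the study.

```python
import math


def langmuir_response(t, conc, ka, kd, r_max, t_assoc_end):
    """Signal of a 1:1 Langmuir binding model at time t (seconds).

    t is measured from the start of the association phase; dissociation
    starts at t_assoc_end. conc is the analyte concentration (M), ka the
    association rate constant (M^-1 s^-1) and kd the dissociation rate
    constant (s^-1), so that K_D = kd / ka.
    """
    k_obs = ka * conc + kd
    r_eq = r_max * ka * conc / k_obs  # equals r_max * conc / (conc + K_D)
    if t <= t_assoc_end:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    # Signal reached at the end of association, then first-order decay.
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_assoc_end))
    return r_end * math.exp(-kd * (t - t_assoc_end))


# Hypothetical parameters: K_D = kd / ka = 100 nM, analyte at 100 nM.
params = dict(conc=1e-7, ka=1e4, kd=1e-3, r_max=100.0, t_assoc_end=600.0)
for t in (0.0, 300.0, 600.0, 900.0):
    print(t, round(langmuir_response(t, **params), 2))
```

With the analyte concentration equal to K_D, the association phase approaches half of r_max, which is a quick sanity check on any implementation of this model.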
Extended Data Table 1 | Summary of data on mHTT lowering or rescue of HD-relevant phenotypes

Model and treatment | Readout and figures | Compound effects
cultured primary cortical neurons, from mice (HdhQ7/Q140) | the mHTT level by Western-blot (Fig. 2a, d) | 10O5: 26.0±3.3% lowering; 8F20: 40.1±12.6% lowering; AN1: 35.7±2.8% lowering; AN2: 34.0±6.2% lowering
primary human HD patient fibroblasts (Q49) | the mHTT level by HTRF (Fig. 3a) | 10O5: 45.1±4.0% lowering; 8F20: 44.8±4.9% lowering; AN1: 46.1±6.2% lowering; AN2: 54.8±7.5% lowering
primary human HD patient fibroblasts (Q55) | the mHTT level by HTRF (Fig. 3a) | 10O5: 34.3±5.4% lowering; 8F20: 28.7±3.6% lowering; AN1: 26.5±7.0% lowering; AN2: 39.3±4.8% lowering
primary human HD patient fibroblasts (Q68) | the mHTT level by HTRF (Fig. 3a) | 10O5: 20.9±2.7% lowering; 8F20: 22.9±5.3% lowering; AN1: 26.4±2.8% lowering; AN2: 18.1±5.1% lowering
HD patient iPSC-derived neurons (Q47) | the mHTT level by HTRF (Ext. Data Fig. 5c) | 10O5: 31.4±3.2% lowering; 8F20: 28.9±3.7% lowering; AN1: 39.3±3.2% lowering; AN2: 40.5±2.8% lowering
HD patient iPSC-derived neurons (Q47) | surface area of each neuron by Tuj1 staining (Fig. 5a) | 10O5: 69.5±1.6% rescue; 8F20: 51.6±3.1% rescue; AN1: 58.8±4.9% rescue; AN2: 64.4±2.7% rescue
immortalized human HD patient fibroblasts (Q47) | the mHTT level by HTRF (Fig. 3b) | 10O5: 30.2±4.5% lowering; 8F20: 22±4.8% lowering; AN1: 42.0±3.9% lowering; AN2: 41.4±5.1% lowering
icv-injected mice (HdhQ7/Q140) | the mHTT level by Western-blot (Ext. Data Fig. 9a) | 10O5: 43.3±2.2% lowering; 8F20: 9.1±5.3% lowering (n.s.); AN1: 29.9±2.9% lowering; AN2: 30.3±7.4% lowering
ip-injected mice (HdhQ7/Q140) | the cortical mHTT by Western-blot (Ext. Data Fig. 9b) | 10O5: 24.8±4.2% lowering; AN2: 36.6±7.4% lowering
ip-injected mice (HdhQ7/Q140) | the striatal mHTT by Western-blot (Ext. Data Fig. 9c) | 10O5: 22.9±2.3% lowering; AN2: 26.3±5.5% lowering
ip-injected mice (HdhQ7/Q140) | the cortical HTT by MASS-SPEC (Ext. Data Fig. 11b) | 10O5: 18.1±2.4% lowering; AN2: 25.2±3.2% lowering
ip-injected mice (HdhQ7/Q140) | latency to fall by rotarod tests (Fig. 5d) | 10O5: (60.8% averaged rescue); AN2: (64.3% averaged rescue)
ip-injected mice (HdhQ7/Q140) | passing time by balance beam tests (Fig. 5e) | 10O5: (77.2% averaged rescue); AN2: (92.8% averaged rescue)
ip-injected mice (HdhQ7/Q140) | gripping force tests (Fig. 5f) | 10O5: (43.6% averaged rescue); AN2: (52.4% averaged rescue)

A summary table showing the percentage lowering of mHTT or HTT levels, and the percentage rescue of HD-relevant phenotypes (normalized to the difference between HD and wild type) in
different HD models assayed by different approaches under optimal conditions. The corresponding data are indicated in the middle column. The percentage change/rescue is presented as
mean ± s.e.m.
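The two summary measures in this table can be illustrated with a short sketch. It assumes that percentage lowering is expressed relative to the vehicle-treated control and that percentage rescue is the treated-minus-untreated HD difference divided by the wild-type-minus-HD difference, as the footnote describes. The function names and example numbers are hypothetical, not values from the study.

```python
def percent_lowering(treated_level, control_level):
    """Percentage reduction of a protein level relative to the control."""
    return (control_level - treated_level) / control_level * 100.0


def percent_rescue(treated_hd, untreated_hd, wild_type):
    """Phenotype rescue normalized to the HD versus wild-type difference.

    0% means no change from the untreated HD value; 100% means the
    treated HD value has reached the wild-type value.
    """
    gap = wild_type - untreated_hd
    if gap == 0:
        raise ValueError("HD and wild-type values are identical; rescue is undefined")
    return (treated_hd - untreated_hd) / gap * 100.0


# Hypothetical example: normalized mHTT signal and rotarod latency (s).
print(round(percent_lowering(treated_level=0.74, control_level=1.0), 1))  # 26.0
print(round(percent_rescue(treated_hd=160.0, untreated_hd=100.0, wild_type=200.0), 1))  # 60.0
```

The same rescue formula applies when the HD phenotype is a higher value than wild type, because the sign of the numerator and denominator change together.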
Statistical parameters


When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main 
text, or Methods section). 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND 
variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Clearly defined error bars 
State explicitly what error bars represent (e.g. SD, SE, CI)


Our web collection on statistics for biologists may be useful. 


Software and code 


Policy information about availability of computer code 


Data collection All data collection software came with the equipment used for the experiments, including IncuCyte 2011A, MO.Control (NT.115),
PerkinElmer EnVision Manager Version 1.13, HKL2000 V719, Image Lab Version 3.0 and ZEN 2.3.


Data analysis The software used for analysis was all commercially available or could be downloaded from open sources, including GraphPad Prism
7, ImageJ 1.52a, Origin8, Microsoft Excel 2016, PASS 16, Nanotemper analysis (1.5.41), PyMOL 2.2, HKL2000, Phaser 2.8, Mascot 2.3,
IncuCyte 2011A and MO.Affinity Analysis (NT.115).


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers 
upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 
Data


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The protein structure data have been uploaded to the PDB database with entry number 6J04. The source data in Excel files have been provided for all essential plots
in the figures. The full gel blots and the proteomics datasets have been provided as supplementary tables. All the other data are available from the authors upon
request.


Field-specific reporting 


Please select the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[x] Life sciences [ ] Behavioural & social sciences [ ] Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/authors/policies/ReportingSummary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size To ensure a statistical power > 0.8, power analyses were performed for each assay, based on estimated values, using PASS 16
(https://www.ncss.com/software/pass/) before the experiments. Estimation was based on our previously published results from similar experiments and on
preliminary experiments. The power analysis suggested n ≥ 3 for mHTT level measurements and n ≥ 5 for behavioral experiments. In all the
experiments we performed, we used a larger n than these numbers, and we also performed post-experiment power analyses to ensure
that power > 0.8 for all the significant differences.
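The inputs to the PASS 16 calculations are not given here, so the following is only a rough sketch of how a minimum group size follows from a target power. It uses the standard normal approximation for a two-sided comparison of two group means; the function name and the effect sizes are assumptions chosen for illustration, and exact t-based calculations give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-group comparison.

    effect_size is the standardized difference (difference in means divided
    by the common standard deviation). Normal approximation:
    n = 2 * ((z_(1 - alpha/2) + z_power) / effect_size) ** 2
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical standardized effect sizes.
for d in (1.0, 2.0):
    print(d, n_per_group(d))  # 1.0 -> 16, 2.0 -> 4
```

Large standardized effects, as might be expected when a treatment shifts a tightly controlled readout, therefore need only a few samples per group to reach a power of 0.8.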


Data exclusions Data were not excluded unless clear experimental failures occurred, including cell contamination, gel transfer failures and lack of signals in 
positive controls. The exclusion criteria were pre-established. 


Replication All experimental data was reliably reproduced in multiple independent experiments as indicated in the figure legends. The protein-compound 
interaction experiments and the HTT-lowering experiments have been replicated by at least two independent researchers. 


Randomization For the in vivo experiments in the mouse, randomization was performed by assigning random numbers. For the Drosophila experiments, the
flies were randomly distributed into the vials after anesthesia.


Blinding As indicated in the figure legends, the investigators were blinded during data collection and analysis where possible, including in
immunofluorescence experiments, drug treatment experiments in mouse and fly models for HTT level measurements and behavioral assays.


Reporting for specific materials, systems and methods 


Materials & experimental systems (n/a | Involved in the study)
Unique biological materials
Antibodies
Eukaryotic cell lines
Palaeontology
Animals and other organisms
Human research participants

Methods (n/a | Involved in the study)
ChIP-seq
Flow cytometry
MRI-based neuroimaging


Unique biological materials 


Policy information about availability of materials 


Obtaining unique materials All unique materials (some of the patient fibroblast lines) are readily available from the authors. 
Antibodies


Antibodies used

Anti-β-tubulin (Abcam, cat. no. ab6046, lot no. GR3209100-1, 1:10000)
Anti-TUBB3 (BioLegend (Covance), cat. no. 801202, clone TUJ1, lot no. B2335555, 1:500)
Anti-ATXN3 (Millipore, cat. no. MAB5360, clone no. 1H9, lot no. 3096481, 1:1000)
Anti-Gapdh (Proteintech, cat. no. 60004-1, lot no. 10004129, 1:5000)
Anti-NBR1 (ThermoFisher Scientific, cat. no. PA5-54660, lot no. SJ2462971A, 0.4 µg/mL)
Anti-β-actin (Beyotime, cat. no. AA128, clone no. AC-74, lot no. 031918180821, 1:1000)
Anti-TBP antibody (Abcam, cat. no. ab818, clone no. 1TBP18, lot no. GR315577-3, 1:1000)
Anti-SQSTM1 antibody (ThermoFisher Scientific, cat. no. PA5-27247, lot no. SC2360851N, 1:1000)
Anti-spectrin antibody (Millipore, cat. no. MAB1622, clone no. AA6, lot no. 2943221, 1:2000)
Anti-Ncoa4 antibody (Santa Cruz, cat. no. sc-373739, clone no. C-4, 1:200)
Anti-GST antibody (Proteintech, cat. no. HRP-66001, clone no. 3G12B10, lot no. 20000091, 1:2000)
Anti-MBP antibody (Proteintech, cat. no. 15089-1-AP, lot no. 00058716, 1:3000)
Anti-His antibody (Beyotime, cat. no. AH367, clone no. AD1.1.10, lot no. 011018180312, 1:100)
Anti-LC3B antibody (ThermoFisher Scientific, cat. no. PA1-16930, lot no. T12629311, 1:1000; ThermoFisher Scientific, cat. no.
700712, clone no. 2H30L32, lot no. 2086347, 1:200)
Phospho-Erk1/2 Pathway Sampler Kit (Cell Signaling Technology, cat. no. 9911, 1:1000; Phospho-MEK1/2, clone no. 41G9, lot
no. 18; Phospho-p44/42 MAPK, clone no. D13.14.4E, lot no. 24)
Anti-GFP (Cell Signaling Technology, cat. no. 2956, clone no. D5.1, lot no. 4, 1:1000)
Anti-BubR1 antibody (BD Transduction, cat. no. 612503, clone no. 9/BUBR1, 1:1000)
Anti-Huntingtin Protein antibody 2166 (Millipore, cat. no. MAB2166, clone no. 1HU-4C8, lot no. 2943221, 1:1000)
Anti-polyQ antibody 3B5H10 (Sigma, cat. no. P1874, clone no. 3B5H10, lot no. 047M4820V, 1:1000)
Anti-HTT antibody (D7F7)XP (Cell Signaling Technology, cat. no. 5656s, clone no. D7F7, lot no. 4, 1:1000)
The other HTT antibodies including 2B7, ab1 and MW1 were previously published and characterized by other groups, and they
were originally obtained from those groups. They were diluted at 1:1000 for Western-blots.


Validation

The anti-β-tubulin antibody has been validated for Western-blots of both human and mouse samples by many previous
publications (e.g. PMID: 28869595). The anti-TUBB3 antibody has been validated for immunocytochemistry of human samples by
many previous publications (e.g. PMID: 30545851). The anti-ATXN3 antibody has been validated for Western-blots of both
human and mouse samples by several publications (e.g. PMID: 28180282). The anti-Gapdh antibody has been validated for
Western-blots of mouse samples by many previous publications (e.g. PMID: 31091447). The anti-NBR1 antibody has been
validated for Western-blots of mouse samples in antibodypedia (https://www.antibodypedia.com/gene/8116/NBR1/antibody/3592759/PA5-54660).
The anti-β-actin antibody has been validated for Western-blots of both human and mouse samples by many previous
publications (e.g. PMID: 30459625). The anti-TBP antibody has been validated for Western-blots of both human and mouse
samples by many previous publications (e.g. PMID: 28280206). The anti-SQSTM1 antibody has been validated for Western-blots
of mouse samples by many previous publications (e.g. PMID: 28869595). The anti-spectrin antibody has been validated for
Western-blots of both human and mouse samples by many previous publications (e.g. PMID: 28869595). The anti-Ncoa4
antibody has been validated for Western-blots of both human and mouse samples by many previous publications (e.g. PMID:
30630985). The anti-GST antibody and the anti-MBP antibody have been validated for in vitro pull-down and Western-blot by
experimental data in this study (Fig. 4a, b). The anti-His antibody has been validated for immunostaining by experimental data in this
study (Fig. 4c). The anti-LC3B antibody (ThermoFisher Scientific, cat. no. PA1-16930) has been validated for Western-blots of
mouse samples by the vendor (https://www.thermofisher.com/cn/zh/antibody/product/LC3B-Antibody-Polyclonal/PA1-16930). The
anti-LC3B antibody (ThermoFisher Scientific, cat. no. 700712) has been validated by previous
publications for immunostaining in human and mouse cells (PMID: 29151587, 22622129). The anti-GFP antibody has been
validated by previous literature (PMID: 31112137; PMID: 31067454; PMID: 30996031). The Phospho-Erk1/2 Pathway Sampler Kit
has been validated for Western-blots of mouse samples by several previous publications (e.g. PMID: 29636449). The anti-BubR1
antibody has been validated for Western-blots of both human and mouse samples by many previous publications (e.g. PMID:
27528194). The anti-HTT antibody (D7F7)XP has been validated for Western-blots of both human and mouse samples by many
previous publications (e.g. PMID: 26863614, PMID: 23575829). The other HTT antibodies including 2166, 3B5H10, 2B7, ab1 and
MW1 have been validated for Western-blots of both human and mouse samples by many previous publications (e.g. PMID:
25738228). The antibody 2166 has also been validated for immunostaining experiments in mouse samples by this study (Fig. 4d).


Eukaryotic cell lines


Policy information about cell lines


Cell line source(s)


Some of the primary patient fibroblasts were obtained from HD patients (Q47, Q49, Q55) and a healthy sibling control (WT, Q19)
in a Mongolian Huntington's disease family. The HD Q68 fibroblast line was obtained from Coriell Cell Repositories
(Camden, NJ, USA). The PD line was obtained from an idiopathic Parkinson's disease patient, and the SCA3 line was obtained
from a SCA3 patient with the ATXN3 expansion mutation (Q74). The studies were approved by The Ethic Community of
Institutes of Biomedical Sciences at Fudan University (#28) for obtaining the HD and wild-type patient fibroblasts, and by
Huashan Hospital Institutional Review Board at Fudan University (#174) for obtaining the PD and SCA3 patient fibroblasts.
Verbal and written consent was obtained from patients. The procedures were in compliance with all relevant ethical
regulations. The immortalized fibroblasts were generated by infection with lentivirus expressing SV40T. For generation of iPS
cells (iPSCs), the primary fibroblasts were transduced with the retroviral STEMCCA polycistronic reprogramming system
(Millipore, cat. no. SCR548). The iPSCs were confirmed positive for Tra-1-81, Tra-1-60, SSEA-4 and Nanog by
immunofluorescence and flow cytometry. All four vector-encoded transgenes were found to be silenced and the karyotype
was normal. iPSCs were cultured in E8 medium (ThermoFisher Scientific, cat. no. A1517001) on a Matrigel (Corning, cat. no.
354277) surface. iPSCs were differentiated to Pax6-expressing primitive neuroepithelia (NE) for 10-12 days in a neural
induction medium. Sonic hedgehog (SHH, 200 ng/ml) was added at days 10-25 to induce ventral progenitors. For neuronal
differentiation, neural progenitor clusters were dissociated and placed onto poly-ornithine/laminin-coated coverslips at day
26 in Neurobasal medium (ThermoFisher Scientific, cat. no. 21103049), with 1x B-27 (ThermoFisher Scientific, cat. no.
17504044), 1x N-2 (ThermoFisher Scientific, cat. no. 17504048), brain-derived neurotrophic factor (BDNF, 20 ng/ml, PeproTech,
cat. no. 450-02), glial-derived neurotrophic factor (GDNF, 10 ng/ml, PeproTech, cat. no. 450-10), insulin-like growth factor 1
(IGF1, 10 ng/ml, PeproTech, cat. no. 100-11) and Vitamin C (Sigma, cat. no. D-0260, 200 ng/ml). The mouse striatal cells (STHdh)
were obtained from Coriell Cell Repositories (Camden, NJ, USA). The HEK293T cells and the HeLa cells were originally
obtained from American Type Culture Collection (ATCC). Atg5 WT and KO MEFs were from N. Mizushima.


Authentication The HEK293T and HeLa cell lines were authenticated by Short Tandem Repeat (STR) profiling methods. The Atg5 WT and KO 
MEFs were obtained directly from the laboratory which generated these cell lines (N. Mizushima), and they were further 
authenticated by Short Tandem Repeat (STR) profiling methods comparing with primary cultured MEFs. The patient 
fibroblasts were obtained and cultured from patients, and they were not authenticated. 


Mycoplasma contamination The cells were tested every two months by a TransDetect PCR Mycoplasma Detection Kit (Transgen Biotech, cat. no.
FM311-01) to ensure that they were mycoplasma free.


Commonly misidentified lines (See ICLAC register) HeLa cells were used in the HTT-LC3 colocalization experiments, because they are a commonly used cell line for autophagy
experiments and they showed more distinct LC3 puncta than the other cells that we tested. In addition, they have high transfection
efficiency.
Animals and other organisms


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals The fruit fly experiments used Drosophila melanogaster, and adult virgin female flies were used for experiments at the indicated
ages (ranging from 0 to 50 days old after eclosion). The mouse experiments used the C57BL/6 strain, including both males and
females of the desired genotype. For icv experiments, the age was 3 months ± 0 days; for ip-injection followed by testing of
cortical HTT, the age was 5 months ± 0 days; for ip-injection followed by testing of striatal HTT, the age was 10 months ± 3
days; for ip-injection followed by behavioral analysis, the age was 10 months ± 3 days.


Wild animals The study did not involve wild animals. 


Field-collected samples The study did not involve samples collected from the field. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics Since we are testing the compounds in patient cells rather than patient groups, we do not have population characteristics of
human participants. For each patient, we cultured many cells and treated different groups of cells with different
compounds to test their effects within each patient cell line.
In general, the patients were diagnosed based on symptoms and genetic testing, and they had received no treatment at the time
they provided dermal fibroblasts. The HD patients (Q47/Q19, 46-year-old female; Q49/Q19, 44-year-old male; Q55/Q19,
39-year-old male) and the healthy sibling control (Q19/Q19, 39-year-old female) were from a Mongolian family. The SCA3
patient was a 32-year-old female when providing the dermal fibroblasts. She first came to the clinic complaining of
clumsiness and slowness in the lower limbs and left upper limb for 1 year. Her father and grandfather had the same symptoms but
had passed away. After a one-year follow-up, she developed unstable walking. She was diagnosed with spinocerebellar ataxia,
confirmed by genetic testing with a repeat number of Q74/Q27 in ATXN3.
The PD patient was a 74-year-old male when providing the dermal fibroblasts. He had had tremor, rigidity and
bradykinesia for 6 years. The symptoms started at age 68; he was diagnosed with PD at age 69 and was followed up in our center
for 5 years. A panel containing 254 PD and related genes and PD MLPA were carried out in the patient but did not find any
known mutations related to PD. DAT-PET CT found decreased DAT binding in the right caudate nucleus and putamen.


Recruitment The HD and SCA3 patients were recruited by clinical symptoms and confirmed with genetic testing. The PD patients were
recruited by clinical symptoms and confirmed by follow-up visits of more than 5 years and DAT-PET CT. The recruitment could
be biased because only a few patients who saw the collaborating doctor and wanted to donate dermal fibroblasts for potential
future research were selected. This is typical for preclinical studies, and our study compares different compound-treated
groups within each cell line, and is thus not influenced by patient-to-patient variations. Nonetheless, while we have tested
multiple patient cells and obtained consistent results, it is still possible that some of the other patient cells show different results.
Metastatic cancer is a major cause of death and is associated with poor treatment
efficacy. A better understanding of the characteristics of late-stage cancer is required 
to help adapt personalized treatments, reduce overtreatment and improve outcomes. 
Here we describe the largest, to our knowledge, pan-cancer study of metastatic solid 
tumour genomes, including whole-genome sequencing data for 2,520 pairs of tumour 
and normal tissue, analysed at median depths of 106x and 38x, respectively, and 
surveying more than 70 million somatic variants. The characteristic mutations of 
metastatic lesions varied widely, with mutations that reflect those of the primary 
tumour types, and with high rates of whole-genome duplication events (56%). 
Individual metastatic lesions were relatively homogeneous, with the vast majority 
(96%) of driver mutations being clonal and up to 80% of tumour-suppressor genes 
being inactivated bi-allelically by different mutational mechanisms. Although 
metastatic tumour genomes showed similar mutational landscape and driver genes to 
primary tumours, we find characteristics that could contribute to responsiveness to 
therapy or resistance in individual patients. We implement an approach for the review 
of clinically relevant associations and their potential for actionability. For 62% of 
patients, we identify genetic variants that may be used to stratify patients towards 
therapies that either have been approved or are in clinical trials. This demonstrates 
the importance of comprehensive genomic tumour profiling for precision medicine
in cancer.


In recent years, several large-scale whole-genome sequencing (WGS)
analysis efforts have yielded valuable insights into the diversity of the
molecular processes that drive different types of adult and paediatric
cancer and have fuelled the promises of genome-driven oncology
care. However, most analyses were done on primary tumour material,
whereas metastatic cancers, which cause the bulk of the disease burden
and 90% of all cancer deaths, have been less comprehensively studied
at the whole-genome level, with previous efforts focusing on tumour-specific
cohorts or at a targeted gene panel or exome level. As
cancer genomes evolve over time, both in the highly heterogeneous
primary tumour mass and as disseminated metastatic cells, a better
understanding of metastatic cancer genomes will be highly valuable
to improve on adapting treatments for late-stage cancers.

Here we describe the pan-cancer whole-genome landscape of metastatic
cancers based on 2,520 paired tumour (106× average depth) and
normal (blood, 38×) genomes from 2,399 patients (Supplementary
Tables 1 and 2, Extended Data Fig. 1). The sample distribution over age
and primary tumour types broadly reflects the incidence of solid cancers
in the Western world, including rare cancers (Fig. 1a). Sequencing
data were analysed using an optimized bioinformatic pipeline based on
open source tools (Methods, Supplementary Information) and identified
a total of 59,472,629 single nucleotide variants (SNVs), 839,126
multiple nucleotide variants (MNVs), 9,598,205 insertions and deletions
(indels) and 653,452 structural variants (SVs) (Supplementary Table 2).


Mutational landscape of metastatic cancer 


We analysed the mutational burden of each class of variant per cancer 
type based on the tissue of origin (Fig. 1, Supplementary Table 2). In 
line with previous studies on primary cancers", we found extensive 
variation in the mutational load of up to three orders of magnitude 
both within and across cancer types. 

The median SNV counts per sample were highest in skin (predominantly
melanoma; 44,000) and lung (36,000) tumours, with tenfold higher SNV
counts than sarcomas (4,100), neuroendocrine tumours (NETs) (3,500)
and mesotheliomas (3,400). SNVs were
mapped to COSMIC mutational signatures and were found to broadly
match the patterns described in previous cancer cohorts per cancer
type” (Extended Data Figs. 2, 3). However, several broad spectrum
signatures such as S3, S8, S9 and S16 as well as some more specific
signatures (for example, S17 in specific tumour types) appear to be
overrepresented in our cohort. These observations may indicate enrichment
of tumours that are deficient in specific DNA repair processes (S3),
increased hypermutation processes (S9) among advanced cancers,
or reflect the mutagenic effects of previous treatments”.

Fig. 1 | Mutational load of metastatic cancer. a, Violin plot showing age
distribution of each tumour type, with twenty-fifth, fiftieth and seventy-fifth
percentiles marked. b, c, Cumulative distribution function plot (individual
samples were ranked independently for each variant type) of mutational load for
each tumour type for SNVs and MNVs (b) and indels and SVs (c). The median for
each tumour type is indicated by a horizontal bar. Dotted lines indicate the
mutational loads in primary cancers from the PCAWG cohort“. Only tumour
types with more than ten samples are shown (n=2,350 independent patients),
and are ranked from the lowest to the highest overall SNV mutation burden
(TMB). CUP, cancer of unknown primary.

The variation for MNVs was even greater, with lung (median of 821)
and skin (median of 764) tumours having five times the median MNV
counts of any other tumour type. This can be explained by the well-known
mutational effects of smoking (CC>AA) and UV radiation (CC>TT),
respectively (Extended Data Fig. 2). Although
only dinucleotide substitutions are typically reported as MNVs, 10.7% of
the MNVs involve three nucleotides and 0.6% had four or more nucleotides
affected.

Indel counts were typically tenfold lower than SNVs, with a lower
relative rate for skin and lung cancers (Fig. 1c). Genome-wide analysis
of indels at microsatellite loci identified 60 samples with microsatellite
instability (MSI) (Supplementary Table 2), which represents 2.5% of all
tumours (Extended Data Fig. 4). Notably, 67% of all indels in the entire
cohort were found in the 60 MSI samples, and 85% of all indels in the
cohort were found in microsatellites or short tandem repeats. The highest
rates of MSI were observed in central nervous system (CNS) (9.4%),
uterine (9.1%) and prostate (6.1%) tumours. For metastatic colorectal
cancer lesions, we found an MSI frequency of only 4.0%, which is lower
than that reported for primary colorectal cancer, and in line with better
prognosis for patients with localized MSI colorectal cancer, which
metastasizes less often’®.

The median rate of SVs across the cohort was 193 per tumour, with the
highest median counts observed in ovarian (412) and oesophageal (372)
tumours, and the lowest in kidney tumours (71) and NETs (56). Simple
deletions were the most commonly observed subtype of SV (33% of all
SVs), and were the most prevalent in every cancer type except stomach
and oesophageal tumours, which were highly enriched in translocations
(Extended Data Fig. 2).

To gain insight into the overall genomic differences between primary
and metastatic cancer, we compared the mutational burden in
the Hartwig Medical Foundation (HMF) metastatic cohort with the
Pancancer Analysis of Whole Genomes (PCAWG) dataset”, which, to
our knowledge, is the largest comparable whole-genome sequenced
tumour cohort (n=2,583) available so far, and which has 95% of biopsies
taken from treatment-naive primary tumours. In general, the SNV
mutational load does not seem to be indicative for disease progres- 
sion as it is not significantly different in this study compared with the 
PCAWG for most cancer types (Fig. 1b). Prostate and breast cancer are 
clear exceptions with structurally higher mutational loads (¢<1x10°, 
Mann-Whitney test), which potentially reflects relevant tumour biol- 
ogy and is, for prostate cancer, consistent with other reports®””. CNS 
tumours also have a higher mutational load that is explained by the 
different age distributions of the cohorts. 
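
As an illustration of the type of comparison reported above, the following minimal sketch computes the Mann-Whitney U statistic for two groups of per-sample mutational loads. It is not the analysis code used in this study; the function name and the example counts are hypothetical, and no P value or multiple-testing correction is derived.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the Mann-Whitney U statistic of sample x against sample y.

    U counts the (x_i, y_j) pairs in which x_i exceeds y_j; ties count as
    half a pair. The direct pair count is O(len(x) * len(y)), which is
    adequate for illustration but slow for large cohorts.
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical per-sample SNV counts (not data from this study).
metastatic = [5200, 6100, 7400, 8000]
primary = [3100, 4000, 4500, 5000]
u_stat = mann_whitney_u(metastatic, primary)
# U ranges from 0 to len(x) * len(y); values near either extreme indicate
# that one group tends to carry higher counts than the other.
```

In practice a library implementation that also returns a P value would normally be preferred over this hand-written count.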

By contrast, the mutational loads of indels, MNVs and SVs are sig- 
nificantly higher across nearly all cancer types analysed (Fig. 1c). This 
is most notable for prostate cancer, in which we observe a more than 
fourfold increased rate of MNVs, indels and SVs. Although these obser- 
vations may represent the advancement of disease and the higher 
rate of certain mutational processes in metastatic cancers, they are 
also partially due to differences in sequencing depth and bioinfor- 
matic analysis pipelines (Extended Data Figs. 5, 6, Supplementary 
Information). 


Copy number alteration landscape 


Pan-cancer, the most highly amplified regions in our metastatic cancer 
cohort contain established oncogenes such as EGFR, CCNE1, CCND1 and
MDM2 (Fig. 2). The chromosomal arms 1q, 5p, 8q and 20q are also highly
enriched in moderate amplification across the cohort, with each affecting
more than 20% of all samples. For amplifications of 5p and 8q, this
is probably related to the common amplification targets of TERT and
MYC, respectively. However, the targets of amplifications on 1q, which
are predominantly found in breast cancers (more than 50% of samples),
and amplifications on 20q, which are predominantly found in colorectal
cancers (more than 65% of samples), are less clear. 

Overall, an average of 23% of the autosomal DNA per tumour has 
loss of heterozygosity (LOH). Unsurprisingly, TP53 has the highest LOH
recurrence at 67% of samples, and many of the other LOH peaks are also
explained by well-known tumour-suppressor genes (TSGs). However,
several clear LOH peaks are observed that cannot easily be explained
by known TSG selection, such as one on 8p (57% of samples). LOH at 8p
has previously been linked to lipid metabolism and drug responses’, 
although the involvement of individual genes has not been established. 

There are remarkable differences in the LOH between cancer types 
(Supplementary Fig. 1). For instance, we observed LOH events on the 
3p arm in 90% of kidney samples” and LOH of the complete chromosome
10 in 72% of CNS tumours (predominantly glioblastoma multiforme”’).
Furthermore, the mechanism for LOH in TP53 is highly
specific to tumour type, with ovarian cancers exhibiting LOH of the
full chromosome 17 in 75% of samples, whereas in prostate cancer (also
70% LOH for TP53) this is nearly always caused by highly focal deletions.

Fig. 2 | Copy number landscape of metastatic cancer. a, Proportion of
samples with amplification and deletion events by genomic position
pan-cancer. The inner ring shows the percentage of tumours with homozygous
deletion (orange), LOH and significant loss (copy number < 0.6 sample ploidy;
dark blue) and near copy neutral LOH (light blue). Outer ring shows percentage
of tumours with high level amplification (>3x sample ploidy; orange), moderate
amplification (>2x sample ploidy; dark green) and low level amplification (>1.4x
amplification; light green). The scale on both rings is 0–100% and inverted for
the inner ring. The most frequently observed high-level gene amplifications
(black text) and homozygous deletions (red text) are shown. b, Proportion of
tumours with a WGD event (dark blue), grouped by tumour type. c, Sample
ploidy distribution over the complete cohort for samples with and without

Unlike LOH events, homozygous deletions are nearly always 
restricted to small chromosomal regions. Not a single example was 
found in which a complete autosomal arm was homozygously deleted.
Homozygous deletions of genes are also surprisingly rare: we found 
only a mean of 2.0 instances per tumour in which one or several con- 
secutive genes are fully or partially homozygously deleted. In 46% of 
these events, a putative TSG was deleted. Loss of chromosome Y is a 
special case and is deleted in 36% of all male tumour genomes but varies 
strongly between tumour types, from 5% deleted in CNS tumours to 
68% deleted in biliary tumours (Extended Data Fig. 7). 

An extreme form of copy number change can be caused by whole- 
genome duplication (WGD). We found WGD events in 56% of all samples 
ranging from 15% in CNS to 80% in oesophageal tumours (Fig. 2). This is
much higher than previously reported for primary tumours (25-37%) 
and from panel-based sequencing analyses of advanced tumours (30%). 
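
Because the copy number categories in Fig. 2 are defined relative to sample ploidy, the same absolute copy number can fall into different categories in samples with and without WGD. The sketch below illustrates this under stated assumptions: the thresholds are read from the Fig. 2 legend as multiples of sample ploidy, the function and category names are hypothetical, and LOH and homozygous deletion status are not evaluated.

```python
def classify_copy_number(copy_number: float, sample_ploidy: float) -> str:
    """Label a segment's copy number relative to the sample ploidy.

    Thresholds follow the Fig. 2 legend, interpreted as multiples of the
    sample ploidy (an assumption). Category names are illustrative only.
    """
    if sample_ploidy <= 0:
        raise ValueError("sample_ploidy must be positive")
    ratio = copy_number / sample_ploidy
    if ratio > 3.0:
        return "high_level_amplification"
    if ratio > 2.0:
        return "moderate_amplification"
    if ratio > 1.4:
        return "low_level_amplification"
    if ratio < 0.6:
        return "significant_loss"
    return "near_ploidy"


# The same absolute copy number of 8 in a diploid and a tetraploid sample.
diploid_label = classify_copy_number(8, 2)      # ratio 4.0
tetraploid_label = classify_copy_number(8, 4)   # ratio 2.0
```

With these thresholds, a segment at eight copies counts as a high-level amplification in a diploid sample but only as a low-level amplification in a tetraploid one.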


Significantly mutated genes 

Analyses for significantly mutated genes using strict significance cut-off 
values (q< 0.01) reproduced previous results on cancer drivers, and 
identified a few novel genes that are potentially related to metastatic 
cancer (Extended Data Fig. 8, Supplementary Table 3). In the pan-cancer
analyses, we identified MLK4 (also known as MAP3K21; q=2x10™*), a
mixed lineage kinase that regulates the JNK, P38 and ERK signalling
pathways and has been reported to inhibit tumorigenesis in colorectal
cancer’. In addition, in our tumour type-specific analyses, we identified
a metastatic breast cancer-specific significantly mutated gene, ZFPM1
(also known as FOG1; q=8 x 10°), a zinc-finger transcription factor
protein without clear links to cancer. Our cohort also lends support
to previous findings for significantly mutated genes that are currently
not included in the COSMIC Cancer Gene Census”. In particular, eight
significantly mutated putative TSGs found previously in an independent
dataset” were also found in our analyses, including GPS2 (pan-cancer,
breast), SOX9 (pan-cancer, colorectal), TGIF1 (pan-cancer, colorectal),
ZFP36L1 (pan-cancer, urinary tract), ZFP36L2 (pan-cancer, colorectal),
HLA-B (lymphoid), MGA (pan-cancer), KMT2B (skin) and RARG
(urinary tract).

We also searched for genes that were significantly amplified or deleted
(Supplementary Table 4). CDKN2A and PTEN were the most significantly
deleted genes overall, but many of the top genes involved common fragile
sites, particularly FHIT and DMD, which were deleted in 5% and 4% of
samples, respectively. The role of common fragile sites in tumorigenesis
is unclear and aberrations that affect these genes are frequently treated
as passenger mutations that reflect localized genomic instability”. In
CTNNB1, we identified a recurrent in-frame deletion of the complete
exon 3 in 12 samples, 9 of which are colorectal cancers. Notably, these
deletions were homozygous but thought to be activating as CTNNB1
normally acts as an oncogene in the WNT and β-catenin pathway and
none of these nine colorectal samples had any APC driver mutations.
We also identified several significantly deleted genes not previously 
reported, including MLLT4 (n =13) and PARD3 (n=9). 

Unlike homozygous deletions, amplification peaks tend to be broad 
and often encompass large numbers of genes, making identification of 
the amplification target challenging. However, SOX4 (6p22.3) stands 
out as a significantly amplified single gene peak (26 amplifications) 
and is highly enriched in urinary tract cancers (19% of samples highly 
amplified). SOX4 is known to be overexpressed in prostate, hepatocellular,
lung, bladder and medulloblastoma cancers with poor prognostic
features and advanced disease status and is a modulator of the PI3K and 
Akt signalling pathway’. 

Also notable was a broad amplification peak of 10 genes around ZMIZ1
at 10q22.3 (n=32), which has not previously been reported. ZMIZ1 is a
transcriptional coactivator of the protein inhibitor of activated STAT 
(PIAS)-like family and is a direct and selective cofactor of NOTCH1 in 
the development of T cells and leukaemia”. CDX2, previously identi- 
fied as an amplified lineage-survival oncogene in colorectal cancer”, is 
also highly amplified in our cohort with 20 out of 22 amplified samples 
found in colorectal cancer, representing 5.4% of all colorectal samples. 


Driver mutation catalogue 


We created a comprehensive catalogue of mutations in known (COSMIC
curated genes”) and newly discovered (ref. “ and this study) cancer 
genes across all samples and variant classes, similar to that previously 
described for primary tumours” (N. Lopez, personal communication). 
We used a prioritization scheme to give a likelihood score for each 
mutation being a potential driver event. By taking into account the 
proportion of SNVs and indels estimated to be passengers using the 
dNdScv R package, we found 13,384 somatic candidate driver events 
among the 20,071 identified mutations in the combined gene panel 
(Supplementary Table 5), together with 189 germline predisposition 
variants (Supplementary Table 6). The somatic candidate drivers include
7,400 coding mutations, 615 non-coding point-mutation drivers, 2,700
homozygous deletions (25% of which are in common fragile sites), 2,392
focal amplifications and 276 fusion events. For non-coding variants, only
essential splice sites and promoter mutations in TERT were included in
the study owing to the current lack of robust evidence for other recurrent
oncogenic non-coding mutations™. A total of 257 variants were found
at 5 known recurrent variant hotspots’ and included in the candidate
driver catalogue. 

For the cohort as a whole, 55% of point mutations in the gene panel
candidate driver catalogue were predicted to be genuine driver events,
using our prioritization scheme (Methods). To facilitate the analysis of
variants of unknown significance at a per-patient level, we calculated a
sample-specific likelihood score for each point mutation being a driver
event by taking into account the mutational burden of the sample,
the biallelic inactivation status for TSGs, and hotspot positions for
oncogenes. Predictions of pathogenic variants agree with known
biology: for example, the clustering of benign missense variants in the 3’
half of the APC gene (Supplementary Fig. 2) fits with the absence of
FAP-causing germline variants in this part of the gene™.

Fig. 3 | The most prevalent driver genes in metastatic cancer. a–c, The most
prevalent somatically mutated oncogenes (a), TSGs (b) and germline
predisposition variants (c). From left to right, the heat map shows the
percentage of samples in each cancer type that are found to have each gene
mutated; absolute bar chart shows the pan-cancer percentage of samples with
the given gene mutated; relative bar chart shows the breakdown by type of
alteration. For TSGs (b), the final bar chart shows the percentage of samples
with a driver in which the gene is biallelically inactivated, and for germline
predisposition variants (c), the final bar chart shows the percentage of samples
with loss of wild type in the tumour.

Overall, the catalogue is similar to previous inventories of cancer
drivers, with TP53 (52% of samples), CDKN2A (21%), PIK3CA (16%), APC
(15%), KRAS (15%), PTEN (13%) and TERT (12%) identified as the most
commonly mutated genes, which together make up 26% of all the candidate
driver mutations in the catalogue (Fig. 3). However, all of the ten most
frequently mutated genes in our catalogue were reported at a higher rate
than for primary cancers®, which may reflect the more advanced disease
state. AR and ESR1 in particular are more prevalent, with putative driver
mutations in 44% of prostate and 16% of breast cancers, respectively.
Both genes are linked to resistance to hormonal therapy, a common
treatment for these tumour types, and have been previously reported
as enriched in advanced metastatic cancer? but are identified at higher
rates in this study.

At the per-patient level, the mean number of total candidate driver
events per patient was 5.7, with the highest rate in urinary tract tumours
(mean value of 8.0) and the lowest in NETs (mean of 2.8) (Fig. 4).
Oesophageal and stomach tumours also had increased driver counts, largely
owing to a much higher rate of deletions in common fragile site genes
(mean of 1.6 for both stomach and oesophageal tumours) compared
with other cancer types (pan-cancer mean of 0.3). Fragile sites aside, the
differential rates of drivers between cancer types in each variant class
do correlate with the relative mutational load (Extended Data Fig. 4),
with the exception of skin cancers, which have a lower than expected
number of SNV drivers.

In 98.6% of all samples, at least one somatic candidate driver mutation
or germline predisposition variant was found. Of the 34 samples with
no identified driver, 18 were NETs of the small intestine (representing
49% of all patients of this subtype). This probably indicates that small
intestine NETs have a distinct set of drivers that are not yet captured
in any of the cancer gene resources used and are also not prevalent
enough in our relatively small NET cohort to be detected as significant.
Alternatively, NETs could be mainly driven by epigenetic mechanisms
that are not detected by WGS®*.

The number of amplified driver genes varied significantly between
cancer types (Extended Data Fig. 7), with highly increased rates per
sample in breast cancer (mean of 2.1), oesophageal cancer (mean of 1.8),
urinary tract and stomach cancers (both mean of 1.7), nearly no
amplification drivers in kidney cancer (mean of 0.1), and none in the
mesothelioma cohort. In tumour types with high rates of amplifications,
these amplifications are generally found across a broad spectrum
of oncogenes, which suggests that there are mutagenic processes active
in these tissues that favour amplifications, rather than tissue-specific
selection of individual driver genes. AR and EGFR are notable exceptions,
with highly selective amplifications in prostate cancer, and in
CNS and lung cancers, respectively, in line with previous reports”””’*.
Notably, we also found twofold more amplification drivers in samples
with WGD events despite amplifications being defined as relative to
the average genome ploidy.

Fig. 4 | Number of drivers and types of mutation per sample by tumour type.
a, Violin plot showing the distribution of the number of drivers per sample
grouped by tumour type (number of patients per tumour type is provided).
Black dots indicate the mean values for each tumour type. b, Relative bar chart
showing the breakdown per cancer type of the type of alteration.

Fig. 5 | Clinical associations and actionability. a, Percentage of samples in
biomarkers with strong biological evidence or clinical trials that indicate that
they are actionable. On-label indicates treatment registered by federal
authorities for that tumour type, whereas off-label indicates a registration for
other tumour types. b, Breakdown of the actionable variants by variant type.

The 189 germline variants identified in 29 cancer predisposition genes
(present in 7.9% of the cohort) consisted of 8 deletions and 181 point 
mutations (Fig. 3c, Supplementary Table 6). The top five affected genes 
(containing nearly 80% of variants) were the well-known germline drivers 
CHEK2, BRCA2, MUTYH, BRCA1 and ATM. The corresponding wild-type 
alleles were found to be lost in the tumour sample in more than half of
the cases, either by LOH or somatic point mutation, indicating a high
penetrance for these variants, particularly in BRCA1 (89% of cases), APC
(83%) and BRCA2 (79%). 

The 276 fusions consisted of 168 in-frame coding fusions, 90 cis- 
activating fusions that involve repositioning of regulatory elements in 
5’ genic regions, and 18 in-frame intragenic deletions in which one or 
more exons was deleted (Supplementary Table 7). ERG (n=88), BRAF
(n=17), ERBB4 (n=16), ALK (n=12), NRG1 (n=9) and ETV4 (n=7)
were the most commonly observed 3’ partners, which together make
up more than half of the fusions. In total, 76 out of the 88 ERG fusions
were TMPRSS2-ERG and affected 36% of all prostate cancer samples in
the cohort. There were 146 fusion pairs not previously recorded in CGI, 
OncoKb, COSMIC or CIViC databases”? ™. 

We found that 71% of somatic driver point mutations in oncogenes
occur at or within five nucleotides of already known pathogenic
mutational hotspots. In the six most prevalent oncogenes (KRAS, PIK3CA,
BRAF, NRAS, TERT and ESR1), the rate was 97% (Extended Data Fig. 9).
Furthermore, in many of the key oncogenes, we document several likely
activating but non-canonical variants near known mutational hotspots,
particularly in-frame indels. Despite in-frame indels being exceptionally
rare overall (mean of 1.7 per tumour), we found an excess in known
oncogenes including PIK3CA (n=18), KIT (n=17), ERBB2 (n=10) and BRAF
(n=8) frequently occurring at or near known hotspots (Extended Data
Fig. 9). In FOXA1, we identified ten in-frame indels that are highly enriched
in prostate cancer (seven out of ten cases) and clustered at two locations
that were not previously associated with pathogenic mutations”.

For TSGs, our results strongly support the Knudson two-hit hypothesis’,
with 80% of all TSG drivers found to have biallelic inactivation by
genetic alterations (Fig. 3): homozygous deletion (32%), multiple somatic
point mutations (7%), or a point mutation in combination with LOH (41%).
This rate is, to our knowledge, the highest observed in any large-scale
WGS cancer study. For many key TSGs, the biallelic inactivation rate is
almost 100%, including TP53 (93%), CDKN2A (97%), RB1 (94%), PTEN (92%)
and SMAD4 (96%), which suggests that biallelic genetic inactivation of these
genes is a strong requirement for metastatic cancer. Other prominent
TSGs, however, have lower biallelic inactivation rates, including ARID1A
(55%), KMT2C (49%) and ATM (49%). For these cases, the other allele
may also be inactivated by non-mutational epigenetic mechanisms,
or tumorigenesis may be driven via a haploinsufficiency mechanism. 
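
The three routes to biallelic inactivation listed above can be expressed as a simple decision rule. The sketch below is illustrative only: the function name, the argument names and the order of precedence are assumptions, and it does not reproduce the per-variant logic of the driver catalogue.

```python
def biallelic_mechanism(homozygous_deletion: bool,
                        n_point_mutations: int,
                        loh: bool) -> str:
    """Assign a TSG driver to one of the inactivation routes in the text.

    The precedence (deletion, then multiple point mutations, then point
    mutation plus LOH) is an assumption made for this sketch. Note that
    two point mutations are only biallelic if they lie on different
    alleles, which this simplified rule does not check.
    """
    if homozygous_deletion:
        return "homozygous_deletion"
    if n_point_mutations >= 2:
        return "multiple_point_mutations"
    if n_point_mutations == 1 and loh:
        return "point_mutation_with_loh"
    return "not_biallelic"


# Percentages of TSG drivers per route, as reported in the text.
reported_share = {
    "homozygous_deletion": 32,
    "multiple_point_mutations": 7,
    "point_mutation_with_loh": 41,
}
total_biallelic = sum(reported_share.values())  # adds up to the 80% quoted
```

Summing the three reported shares recovers the overall 80% biallelic inactivation rate quoted for TSG drivers.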

We examined the pairwise co-occurrence of driver gene mutations 
per cancer type and found ten combinations of genes that were signifi- 
cantly mutually exclusively mutated, and ten combinations of genes 
that were significantly concurrently mutated (Extended Data Fig. 10). 
Although most of these relationships are well established, in breast
cancer, we found new positive relationships for GATA3-VMP1 (q=6 x 10°)
and FOXA1-PIK3CA (q=3 x 10°), and negative relationships for
ESR1-TP53 (q =9 x 10*) and GATA3-TP53 (q=5 x 10°). These findings will
need further validation and experimental follow-up to understand the 
underlying biology. 
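
Pairwise co-occurrence and mutual exclusivity of two driver genes are commonly assessed on a 2 x 2 table of sample counts. The sketch below uses a one-sided Fisher's exact test as an assumed stand-in; the test actually used in this study is described in its Methods and Supplementary Information rather than here, the function name is hypothetical, and the correction for multiple testing that yields q values is not shown.

```python
from math import comb


def fisher_tail(both: int, only_a: int, only_b: int, neither: int,
                tail: str = "greater") -> float:
    """One-sided Fisher's exact P value for a 2 x 2 table of sample counts.

    tail="greater" tests for co-occurrence (at least `both` samples carry
    mutations in both genes); tail="less" tests for mutual exclusivity
    (at most `both` samples carry both). Margins are held fixed.
    """
    n = both + only_a + only_b + neither
    n_a = both + only_a          # samples mutated in gene A
    n_b = both + only_b          # samples mutated in gene B
    if tail == "greater":
        ks = range(both, min(n_a, n_b) + 1)
    elif tail == "less":
        ks = range(max(0, n_a + n_b - n), both + 1)
    else:
        raise ValueError("tail must be 'greater' or 'less'")
    # Hypergeometric probability of each table with k double-mutated samples.
    favourable = sum(comb(n_b, k) * comb(n - n_b, n_a - k) for k in ks)
    return favourable / comb(n, n_a)


# Small hypothetical table (not data from this study).
p_cooccur = fisher_tail(3, 1, 1, 3, tail="greater")
p_exclusive = fisher_tail(1, 3, 3, 1, tail="less")
```

Both example tables give a P value of 17/70, far from significant, as expected for so few samples.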


Clonality of variants 


To obtain insight into ongoing tumour evolution dynamics, we examined 
the clonality of all variants. Notably, only 6.6% of all SNVs, MNVs and 
indels across the cohort and just 3.7% of the point-mutation drivers were 
found to be subclonal (Extended Data Fig. 11). The low proportion of samples
with subclonal variants could be partially due to the detection limits
of the sequencing approach (sequencing depth, bioinformatic analysis 
settings), particularly for low purity samples. However, even for samples 
with more than 80% purity, the total proportion of subclonal variants 
only reaches 10.6% (Extended Data Fig. 11). Furthermore, sensitized 
detection of variants at hotspot positions in cancer genes showed that 
our analysis pipeline detected over 96% of variants with allele frequen- 
cies above 3%. Although the cohort contains some samples with high 
fractions of subclonal variants, overall the metastatic tumour samples 
are relatively homogeneous without the presence of multiple diverged 
major subclones. Low intratumour heterogeneity may be in part attributed
to the fact that nearly all biopsies were obtained by a core needle
biopsy, which results in highly localized sampling, but is nevertheless 
much lower than previous observations in primary cancers”. 

In the 117 patients with independently collected repeat biopsies
(Supplementary Table 8), we found 11% of all SNVs to
be subclonal. Although 71% of clonal variants were shared between
biopsies, only 29% of the subclonal variants were shared. Although we cannot
exclude the presence of larger amounts of lower frequency subclonal
variants, our results suggest a model in which individual metastatic
lesions are dominated by a single clone at any one point in time and that
more limited tumour evolution and subclonal selection takes place
after distant metastatic seeding. This contrasts with observations in
primary tumours, in which larger degrees of subclonality and several
major subclones are more frequently observed’, but supports other
recent studies that demonstrate minimal driver gene heterogeneity in
metastases**.


Clinical associations 

We analysed opportunities for biomarker-based treatment for all 
patients by mapping driver events to clinical annotation databases 
(CGI, CIViC® and OncoKB*’). In 1,480 patients (62%), at least one 
predicted candidate ‘actionable’ event was identified (as defined in 
the Methods, Supplementary Table 9), in line with results from primary
tumours”. Half of the patients with a predicted candidate actionable
event (31% of total) contained a biomarker with a predicted sensitivity
to a drug at level A (approved anti-cancer drugs) and lacked any known
resistance biomarkers for the same drug (Fig. 5a). In 18% of patients,
the suggested therapy was a registered indication, whereas in 13% of
cases it was outside the labelled indication. In a related pilot study with
implementation in 215 treated patients, we showed that such treatment
with anticancer drugs outside of their approved label can result in overall
clinical benefits**. In a further 31% of patients, a level B (experimental
therapy) biomarker was identified. The predicted actionable events
spanned all variant classes including 1,815 SNVs, 48 MNVs, 190 indels, 745
copy number alterations, 69 fusion genes and 60 patients with microsatellite
instability (Fig. 5b).
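
The tiers described above (level A on-label, level A off-label, level B) imply that each patient is reported at the strongest tier among their biomarkers. The sketch below shows one way to express that ranking. It is a simplified assumption rather than the procedure defined in the Methods: the function name and the tuple layout are hypothetical, and the exclusion of patients with known resistance biomarkers is not modelled.

```python
from typing import Iterable, Optional, Tuple

# Lower rank means stronger evidence for this sketch.
_TIER_RANK = {"A_on_label": 0, "A_off_label": 1, "B": 2}


def best_actionability(
    biomarkers: Iterable[Tuple[str, bool]]
) -> Optional[str]:
    """Return the strongest actionability tier among a patient's biomarkers.

    Each biomarker is an (evidence_level, on_label) pair, where
    evidence_level is "A" (approved drug) or "B" (experimental therapy).
    The on_label flag is ignored for level B. Returns None when no
    biomarker is actionable.
    """
    best: Optional[str] = None
    for level, on_label in biomarkers:
        if level == "A":
            tier = "A_on_label" if on_label else "A_off_label"
        elif level == "B":
            tier = "B"
        else:
            continue  # unrecognized levels are skipped
        if best is None or _TIER_RANK[tier] < _TIER_RANK[best]:
            best = tier
    return best


# Reported patient shares per tier, in percent, as given in the text.
share = {"A_on_label": 18, "A_off_label": 13, "B": 31}
total_actionable = sum(share.values())  # matches the 62% quoted above
```

The three reported shares add up to the 62% of patients with at least one candidate actionable event.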

Tumour mutation burden (TMB) is an important emerging biomarker
for responses to immune checkpoint inhibitor therapy as it is a proxy
for the amount of neo-antigens in the tumour cells. In two large phase 3
trials of patients with non-small-cell lung cancer, both progression-free 
survival and overall survival are significantly improved with first line 
immunotherapy as compared with chemotherapy for patients whose 
tumours have a TMB of greater than 10 mutations per megabase*”**. 

Although various clinical studies based on this parameter are currently
emerging, TMB was not yet included in the above actionability analysis. 
However, when applying this cut-off to all samples in our cohort, 18% of 
patients would qualify, varying from 0% for patients with mesothelioma, 
liver and ovarian cancers to more than 50% for patients with lung and 
skin cancers (Extended Data Fig. 4b). 
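
Applying the cut-off amounts to dividing a genome-wide mutation count by the genome size in megabases and comparing the result with 10. The sketch below illustrates the arithmetic only. The genome size of 3,000 Mb is an assumed round figure, the function names are hypothetical, and which variant classes count towards TMB in this study is not restated here, so the median SNV counts are used purely as example inputs.

```python
def tumour_mutation_burden(n_mutations: int, genome_size_mb: float) -> float:
    """Return mutations per megabase for a genome-wide mutation count."""
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return n_mutations / genome_size_mb


def exceeds_tmb_cutoff(n_mutations: int, genome_size_mb: float,
                       cutoff_per_mb: float = 10.0) -> bool:
    """True if the burden is strictly greater than the cut-off."""
    return tumour_mutation_burden(n_mutations, genome_size_mb) > cutoff_per_mb


# Assumed approximate genome size in megabases (not taken from the study).
ASSUMED_GENOME_MB = 3000.0

lung_like = tumour_mutation_burden(36_000, ASSUMED_GENOME_MB)
sarcoma_like = tumour_mutation_burden(4_100, ASSUMED_GENOME_MB)
lung_qualifies = exceeds_tmb_cutoff(36_000, ASSUMED_GENOME_MB)
sarcoma_qualifies = exceeds_tmb_cutoff(4_100, ASSUMED_GENOME_MB)
```

Under these assumptions, a tumour with the median lung SNV count sits just above the cut-off, whereas one with the median sarcoma count falls well below it.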


Data availability and resource access 


The Hartwig Medical cohort described here is, to our knowledge, the 
largest metastatic whole-genome cancer resource, and based on a broad
patient consent was specifically developed as a community resource 
for international academic cancer research. Somatic variants and basic 
clinical data (tumour type, gender, age) are publicly available and can 
be explored at the patient, cohort and gene level through a graphical 
interface (database.hartwigmedicalfoundation.nl) originally devel- 
oped by the International Cancer Genome Consortium”. Patient-level 
genome-wide germline and somatic data (raw BAM files and annotated 
variant call data) are considered privacy sensitive and available through
an access-controlled mechanism (see www.hartwigmedicalfoundation.nl/en
for details).

The cohort is still expanding, with data from 4,000 patients already
available, and includes data that go beyond the basic clinical and 
genomic data analysed in this paper such as post-biopsy treatments 
and responses, and previous treatment information. 


Discussion 


Genomic testing of tumours faces numerous challenges in meeting clini- 
cal needs, including the interpretation of variants of unknown signifi- 
cance, the steadily expanding universe of actionable genes—often with 
an increasingly small fraction of patients affected—and the development
of advanced genome-derived biomarkers such as tumour mutational 
load, DNA repair status and mutational signatures. Our results demon- 
strate that WGS analyses of metastatic cancer can provide novel and 
relevant insights and are instrumental in addressing some of the key 
challenges of precision medicine in cancer. 

First, our systematic and large-scale pan-cancer analyses on metastatic
cancer tissue allowed for the identification of several cancer drivers and
mutation hotspots. Second, the driver catalogue analyses can be used
to mitigate the problem of variants of unknown significance interpretation”
both by leveraging previously identified pathogenic mutations
(accounting for more than two-thirds of oncogenic point-mutation
drivers) and by careful analysis of the biallelic inactivation of putative
TSGs that accounts for over 80% of TSG drivers in metastatic cancer.
Third, we demonstrate the importance of accounting for all types of
variant, including large-scale genomic rearrangements (via fusions and 
copy number alteration events), which account for more than half of 
all drivers, but also activating MNVs and indels that we have shown are 
commonly found in many key oncogenes. Fourth, we have shown that 
using WGS, even with very strict variant calling criteria, we could find 
candidate driver variants in more than 98% of all metastatic tumours, 
including predicted putatively actionable events in a clinical and experimental
setting for up to 62% of patients.

Although we did not find metastatic tumour genomes to be funda- 
mentally different from primary tumours in terms of the mutational 
landscape or genes that drive advanced tumorigenesis, we described 
characteristics that could contribute to responsiveness to therapy or 
resistance in individual patients. In particular, we showed that WGD 
events are a more pervasive element of tumorigenesis than previously 
understood, affecting over half of all metastatic cancers. We also found 
metastatic lesions to be less heterogeneous than reported for primary 
tumours, although the limited sequencing depth does not allow conclu- 
sions to be made about low-frequency subclonal variants. 

The cohort described here provides a valuable complementary 
resource to whole-sequence-based data of primary tumours such as 
the PCAWG project in advancing fundamental and translational cancer 
research. Although it was established as a pan-cancer resource, several of 
the tumour type-specific cohorts are very large in their own right. Two
of these cohorts (prostate and breast) have already been analysed in more
detail, providing enhanced cancer subtype stratification and revealing 
characteristic genomic differences between primary and metastatic 
tumours. As the Hartwig Medical cohort includes a mix of treatment- 
naive metastatic patients and patients who have undergone (extensive) 
previous systemic treatments, it provides unique opportunities to study 
responses and resistance to treatments and discover predictive biomark- 
ers, as these data are available for discovery and validation studies. 


Methods


A detailed description of methods and validations is available as
Supplementary Information. No statistical methods were used to predetermine
sample size. The experiments were not randomized, and investigators were
not blinded to allocation during experiments and outcome assessment.


Sample collection 

Patients with advanced cancer not curable by local treatment options 
and being candidates for any type of systemic treatment and any line 
of treatment were included as part of the CPCT-02 (NCT01855477) and 
DRUP (NCT02925234) clinical studies, which were approved by the 
medical ethical committees (METC) of the University Medical Center 
Utrecht and the Netherlands Cancer Institute, respectively. A total of 
41 academic, teaching and general hospitals across The Netherlands 
participated in these studies and collected material and clinical data 
by standardized protocols. Patients have given explicit consent for
whole-genome sequencing and data sharing for cancer research pur- 
poses. Core needle biopsies were sampled from the metastatic lesion, 
or when considered not feasible or not safe, from the primary tumour 
site and frozen in liquid nitrogen. A single 6-µm section was collected
for haematoxylin and eosin (H&E) staining and estimation of tumour
cellularity by an experienced pathologist, and 25 sections of 20 µm were
collected in a tube for DNA isolation. In parallel, a tube of blood was
collected. Leftover material (biopsy, DNA) was stored in biobanks associ-
ated with the studies at the University Medical Center Utrecht and the 
Netherlands Cancer Institute. 


Whole-genome sequencing and variant calling 

DNA was isolated from biopsies (>30% tumour cellularity) and blood
according to the supplier’s protocols (Qiagen) using the DSP DNA Midi
kit for blood and QIAsymphony DSP DNA Mini kit for tissue. A total of
50-200 ng of DNA (sheared to an average fragment length of 450 nt) was
used as input for TruSeq Nano LT library preparation (Illumina).
Barcoded libraries were sequenced as pools on HiSeqX generating 2 × 150
read pairs using standard settings (Illumina). BCL output was converted
using the bcl2fastq tool (Illumina, v.2.17 to v.2.20) with default parameters.
Reads were mapped to the reference genome GRCh37 using BWA-mem
v.0.7.5a, duplicates were marked for filtering and indels were realigned
using GATK v.3.4.46 IndelRealigner. GATK HaplotypeCaller v.3.4.46
was run to call germline variants in the reference sample. For somatic
SNV and indel variant calling, GATK BQSR was applied to recalibrate
base qualities. SNV and indel somatic variants were called using Strelka
v.1.0.14 with optimized settings and post-calling filtering. Structural
variants were called using Manta (v.1.0.3) with default parameters
followed by additional filtering to improve precision using an internally
built tool (Breakpoint-Inspector v.1.5). To assess the effect of
sequencing depth on variant calling sensitivity, we downsampled the BAMs of
10 samples at random by 50% and reran the identical somatic variant
calling pipeline.


Purity, ploidy and copy number calling 

Copy number calling and determination of sample purity were
performed using PURPLE (PURity & PLoidy Estimator), which combines
B-allele frequency, read depth and structural variants to estimate the
purity of a tumour sample and determine the copy number and minor
allele ploidy for every base in the genome. The purity and ploidy
estimates and copy number profile obtained from PURPLE were validated on
in silico simulated tumour purities, by DNA fluorescence in situ
hybridization (FISH) and by comparison with an alternative tool (ASCAT).
ASCAT was run on GC-corrected data using default parameters except
for gamma, which was set to 1 (which is recommended for massively
parallel sequencing data). We implemented a simple heuristic to
determine whether a WGD event has occurred: major allele ploidy > 1.5 on
at least 50% of at least 11 autosomes. This threshold was chosen because
the number of duplicated autosomes per sample (that is, the number of
autosomes that satisfy the above rule) follows a bimodal distribution,
with 95% of samples having either <6 or >15 autosomes duplicated.
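
The WGD rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the PURPLE implementation: the input format (a mapping from autosome name to the fraction of its bases with major allele ploidy > 1.5) and the function names are hypothetical.

```python
from typing import Mapping


def count_duplicated_autosomes(fraction_above: Mapping[str, float],
                               min_fraction: float = 0.5) -> int:
    """Count autosomes on which at least `min_fraction` of the bases
    have major allele ploidy > 1.5.

    `fraction_above` is a hypothetical input: autosome name -> fraction
    of that autosome with major allele ploidy > 1.5.
    """
    return sum(1 for fraction in fraction_above.values()
               if fraction >= min_fraction)


def has_wgd(fraction_above: Mapping[str, float],
            min_autosomes: int = 11) -> bool:
    """Apply the heuristic from the text: WGD is called when at least
    11 autosomes satisfy the per-autosome rule."""
    return count_duplicated_autosomes(fraction_above) >= min_autosomes
```

Because the number of duplicated autosomes is described as bimodal (mostly <6 or >15), the exact cut-off of 11 should rarely change the call for a given sample.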


Sample selection for downstream analyses 

Following copy number calling, samples were filtered out based on 
absence of somatic variants, purity <20%, and GC biases, yielding a 
high-quality dataset of 2,520 samples. Where multiple biopsies exist 
for a single patient, the highest purity sample was used for downstream
analyses (resulting in 2,399 samples). 
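
A minimal sketch of this selection step follows, assuming a simple per-sample record. The `Sample` fields and function name are hypothetical, and the GC-bias filter is omitted because the text does not specify its criterion.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Sample:
    sample_id: str
    patient_id: str
    purity: float               # estimated tumour purity, 0..1
    somatic_variant_count: int


def select_samples(samples: List[Sample],
                   min_purity: float = 0.20) -> Tuple[List[Sample], List[Sample]]:
    """Return (samples passing QC, one highest-purity sample per patient)."""
    # Drop samples with no somatic variants or purity below 20%.
    passing = [s for s in samples
               if s.somatic_variant_count > 0 and s.purity >= min_purity]
    # Keep the highest-purity passing sample for each patient.
    best: Dict[str, Sample] = {}
    for s in passing:
        current = best.get(s.patient_id)
        if current is None or s.purity > current.purity:
            best[s.patient_id] = s
    return passing, list(best.values())
```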


Mutational signature analysis 

Mutational signatures were determined by fitting SNV counts per 96
tri-nucleotide context to the 30 COSMIC signatures using the
MutationalPatterns package. Residuals were calculated as the sum of the
absolute difference between observed and fitted across the 96 buckets.
Signatures with <5% overall contribution to a sample or absolute fitted
mutational load <300 variants were excluded from the summary plot.
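
The residual and the summary-plot filter can be written out as follows. This is a sketch only: the fitting itself (done with MutationalPatterns in the study) is not reproduced, and the function names and the per-signature load dictionary are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def absolute_residual(observed: Sequence[float],
                      fitted: Sequence[float]) -> float:
    """Sum of absolute differences between observed and fitted counts
    across the trinucleotide context buckets."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    return sum(abs(o - f) for o, f in zip(observed, fitted))


def relative_residual(observed: Sequence[float],
                      fitted: Sequence[float]) -> float:
    """Absolute residual divided by the total mutational load."""
    return absolute_residual(observed, fitted) / sum(observed)


def signatures_for_summary(fitted_load: Dict[str, float],
                           min_fraction: float = 0.05,
                           min_load: float = 300) -> List[str]:
    """Keep signatures contributing at least 5% of the sample's fitted
    load and at least 300 fitted variants."""
    total = sum(fitted_load.values())
    return [name for name, load in fitted_load.items()
            if load >= min_load and load / total >= min_fraction]
```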


Germline predisposition variant calling 

We searched for pathogenic germline variants (SNVs, indels and copy 
number alterations) in a broad list of 152 germline predisposition genes 
previously curated, using GATK HaplotypeCaller output from each
sample. For each variant identified, we assessed the genotype in the
germline (HET or HOM), whether there was a second somatic hit in the
tumour, and whether the wild type or the variant itself was lost by a copy
number alteration. We observed that, for variants in many of the 152
predisposition genes, the rate of loss of the wild type in the tumour via
LOH was lower than the average rate of LOH across the cohort and that fewer
than 5% of observed variants had a second somatic hit in the same gene.
Moreover, in many of these genes, the ALT variant was lost via LOH as
frequently as the wild type, suggesting that a considerable portion of the
566 variants may be passengers. For our downstream analysis and driver
catalogue, we therefore restricted our analysis to a more conservative
‘high confidence’ list including only the 25 cancer-related genes in the
ACMG secondary findings reporting guidelines (v.2.0), together with
four curated genes (CDKN2A, CHEK2, BAP1 and ATM), selected because
these are the only additional genes from the larger list of 152 genes with
a significantly increased proportion of called germline variants with loss
of wild type in the tumour sample.


Clonality and biallelic status of point mutations 

The ploidy of each variant is calculated by adjusting the observed VAF 
by the purity and then multiplying by the local copy number to work 
out the absolute number of chromatids that contain the variant. We 
mark a mutation as biallelic (that is, no wild type remaining) if variant 
ploidy > local copy number − 0.5. For each variant, we also determine a
probability that it is subclonal. This is achieved via a two-step process 
involving fitting the somatic ploidies for each sample into a set of clonal 
and subclonal peaks and calculating the probability that each individual 
variant belongs to each peak. Subclonal counts are calculated as the 
total density of the subclonal peaks for each sample. Subclonal driver 
counts are calculated as the sum across the driver catalogue of subclonal 
probability × driver likelihood.
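
As a worked illustration of the calculation above, the sketch below adjusts the observed VAF for purity and converts it into a number of variant-bearing chromatids. The exact adjustment is an assumption: it models the non-tumour fraction as diploid (`normal_copy_number = 2`) and not carrying the variant. Function names are hypothetical.

```python
from typing import Iterable, Tuple


def variant_ploidy(vaf: float, purity: float, local_copy_number: float,
                   normal_copy_number: float = 2.0) -> float:
    """Estimated number of chromatids per tumour cell carrying the variant.

    Average copies per cell in the mixed sample are
    purity * local_copy_number + (1 - purity) * normal_copy_number;
    the observed VAF times this total, divided by purity, gives the
    variant copies per tumour cell.
    """
    total_copies = purity * local_copy_number + (1.0 - purity) * normal_copy_number
    return vaf * total_copies / purity


def is_biallelic(ploidy: float, local_copy_number: float) -> bool:
    """No wild type remaining: variant ploidy > local copy number - 0.5."""
    return ploidy > local_copy_number - 0.5


def subclonal_driver_count(drivers: Iterable[Tuple[float, float]]) -> float:
    """Sum of subclonal probability x driver likelihood over the catalogue.

    Each item is (subclonal_probability, driver_likelihood).
    """
    return sum(p_subclonal * likelihood for p_subclonal, likelihood in drivers)
```

For example, a variant on the single remaining copy of an LOH region (copy number 1) in a 50% pure sample is expected at a VAF of 1/3, which gives a variant ploidy of 1 and is therefore marked biallelic.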


MSI status determination 

To determine the MSI status, we used the method described by the
MSIseq tool and counted the number of indels per million bases
occurring in homopolymers of five or more bases or dinucleotide,
trinucleotide and tetranucleotide sequences of repeat count four or more.
Samples with an MSIseq score of >4 were considered MSI.
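
The scoring rule can be sketched as below. This is not the MSIseq code: the way repeat context is tested (checking whether a flanking sequence starts with a qualifying repeat) and all function names are assumptions made for illustration.

```python
def starts_with_qualifying_repeat(sequence: str) -> bool:
    """True if `sequence` begins with a homopolymer of >= 5 bases or a
    di-, tri- or tetranucleotide unit repeated >= 4 times."""
    for unit_length in (1, 2, 3, 4):
        min_repeats = 5 if unit_length == 1 else 4
        unit = sequence[:unit_length]
        if len(unit) == unit_length and sequence.startswith(unit * min_repeats):
            return True
    return False


def msi_score(repeat_indel_count: int, genome_length_bases: float) -> float:
    """Indels in qualifying repeat contexts per million bases."""
    return repeat_indel_count / (genome_length_bases / 1_000_000)


def is_msi(score: float, threshold: float = 4.0) -> bool:
    """Samples with a score above 4 are classified as MSI."""
    return score > threshold
```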


Significantly mutated driver genes 
We used Ensembl v.89.37 as a basis for gene definitions and have taken 
the union of Entrez identifiable genes and protein-coding genes as our 
base panel (25,963 genes of which 20,083 genes are protein coding).
Pan-cancer and at an individual cancer level we tested the normalized
nonsynonymous (dN) to synonymous substitution (dS) rate (that is,
dN/dS) using dNdScv against a null hypothesis that dN/dS = 1 for each
variant subtype. To identify significantly mutated genes in our cohort,
we used a strict significance cut-off value of q < 0.01.
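
To make the q-value cut-off concrete, the sketch below applies a Benjamini-Hochberg correction to per-gene P values and keeps genes with q < 0.01. It is a generic illustration rather than the dNdScv internals; the function names and the input dictionary are hypothetical.

```python
from typing import Dict, List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values (q values),
    in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values


def significant_genes(p_by_gene: Dict[str, float],
                      q_cutoff: float = 0.01) -> List[str]:
    """Genes whose q value is strictly below the cut-off."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, q_values) if q < q_cutoff]
```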

To search for significantly amplified and deleted genes, we first calcu- 
lated the minimum exonic copy number per gene. For amplifications, we 
searched for all the genes with high-level amplifications only (defined 
as minimum exonic copy number > 3 × sample ploidy). For deletions,
we searched for all the genes in each sample with either full or partial
homozygous gene deletions (defined as minimum exonic copy number
< 0.5) excluding the Y chromosome. We then searched separately for
amplifications and deletions, on a per-chromosome basis, for the most
significant focal peaks, using an iterative GISTIC-like peel off method.
Most of the deletion peaks resolve clearly to a single target gene, which
reflects the fact that homozygous deletions are highly focal, but for 
amplifications this is not the case and most of our peaks have ten or 
more candidates. We therefore annotated the peaks, to choose a single 
putative target gene using an objective set of automated curation rules. 
Finally, filtering was applied to yield highly significant deletions and 
amplifications. 
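
The two per-gene thresholds used before peak calling can be expressed directly. The sketch below covers only these definitions, not the GISTIC-like peel-off or the curation rules; the function names and the list-of-exon-copy-numbers input are assumptions.

```python
from typing import Sequence


def min_exonic_copy_number(exon_copy_numbers: Sequence[float]) -> float:
    """Minimum copy number over the exons of a gene."""
    return min(exon_copy_numbers)


def is_high_level_amplification(min_exonic_cn: float,
                                sample_ploidy: float) -> bool:
    """High-level amplification: minimum exonic copy number > 3 x ploidy."""
    return min_exonic_cn > 3.0 * sample_ploidy


def is_homozygous_deletion(min_exonic_cn: float, chromosome: str) -> bool:
    """Full or partial homozygous deletion: minimum exonic copy number
    < 0.5, with the Y chromosome excluded."""
    return chromosome != "Y" and min_exonic_cn < 0.5
```

Using the minimum over exons means that a deletion affecting a single exon is enough to flag the gene, whereas an amplification must cover every exon.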

Homozygous deletions were also annotated as common fragile sites
based on their genomic characteristics, including a strong enrichment in
long genes (>500,000 bases) and a high rate (>30%) of deletions between
20 kb and 1 Mb.


Somatic driver catalogue construction 
We created a catalogue of mutations in known cancer genes in our cohort
across all variant types on a per-patient basis. This was done in a similar
incremental manner to that previously described (N. Lopez, personal
communication), in which we first calculated the number of genes with
putative driver mutations in a broad panel of known and significantly
mutated genes across the full cohort, and then assigned the candidate
driver mutations for each gene to individual patients by ranking and
prioritizing each of the observed variants. Key points of difference in
this study were both the prioritization mechanism used and our choice
to ascribe each mutation a probability of being a driver rather than a
binary cut-off based on absolute ranking.

The four steps to create the catalogue are as follows. (1) Create 
a panel of candidate genes for point mutations using significantly 
mutated genes and known cancer genes using the union of
Martincorena significantly mutated genes (filtered to significance of
q < 0.01), HMF significantly mutated genes (q < 0.01) at global level or
at cancer type level and COSMIC curated genes (v.83). (2) Determine
TSG or oncogene status of each significantly mutated gene using a 
logistic regression classification model (trained using COSMIC anno- 
tation). (3) Add mutations from all variant classes to the catalogue 
when meeting any of the following criteria: (i) all missense and in- 
frame indels for panel oncogenes; (ii) all non-synonymous and essen- 
tial splice point mutations for TSGs; (iii) all high-level amplifications 
for significantly amplified target genes and panel oncogenes; (iv) 
all homozygous deletions for significantly deleted target genes and 
panel TSGs; (v) all known or promiscuous in-frame gene fusions; and 
(vi) recurrent TERT promoter mutations. (4) Calculate a per-sample 
likelihood score (between 0 and 1) for each mutation in the catalogue 
as a potential driver event, to ensure that only likely pathogenic and 
excess mutations (based on dN/dS) are used to determine the number 
of drivers. All putative driver mutation counts reported at a per-cancer 
type or sample level refer to the sum of driver likelihoods for that 
cancer type or sample. 
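
Because each catalogue entry carries a likelihood rather than a binary call, driver counts are sums of likelihoods. A minimal sketch follows, assuming a flat list of catalogue records; the record layout and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record: (sample_id, cancer_type, gene, driver_likelihood)
DriverRecord = Tuple[str, str, str, float]


def drivers_per_sample(catalogue: Iterable[DriverRecord]) -> Dict[str, float]:
    """Sum of driver likelihoods for each sample."""
    totals: Dict[str, float] = defaultdict(float)
    for sample_id, _cancer_type, _gene, likelihood in catalogue:
        if not 0.0 <= likelihood <= 1.0:
            raise ValueError("driver likelihood must lie between 0 and 1")
        totals[sample_id] += likelihood
    return dict(totals)


def drivers_per_cancer_type(catalogue: Iterable[DriverRecord]) -> Dict[str, float]:
    """Sum of driver likelihoods for each cancer type."""
    totals: Dict[str, float] = defaultdict(float)
    for _sample_id, cancer_type, _gene, likelihood in catalogue:
        totals[cancer_type] += likelihood
    return dict(totals)
```

A sample with one certain driver and two candidates at likelihood 0.5 and 0.25 would therefore be reported as carrying 1.75 drivers.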


Clinical associations and actionability analysis 
To determine clinical associations and potential actionability of the
variants observed in each sample, we compared all variants with
three external clinical annotation databases (OncoKB, CGI and
CIViC) that were mapped to a common data model as defined by
https://civicdb.org/help/evidence/evidence-levels. Here, we
considered only A and B level variants. This classification of potential
actionable events can also be mapped to the recently proposed ESMO
Scale for Clinical Actionability of molecular Targets (ESCAT) as
follows: ESCAT I-A+B (for A on-label) and I-C (for A off-label) and ESCAT
II-A+B (for B on-label) and III-A (for B off-label). Each candidate
actionable mutation was also classified as either on-label
(that is, evidence supports treatment in that specific cancer type)
or off-label (evidence exists in another cancer type). To do this, we
annotated both the patient cancer types and the database cancer
types with relevant DOIDs, using the disease ontology database. For
each candidate actionable mutation in each sample, we aggregated
all the mapped evidence that was available supporting both on-label
and off-label treatments at the A or B evidence level. Treatments that
also had evidence supporting resistance based on other biomarkers
in the sample at the same or higher evidence level were excluded as
non-actionable. Samples classified as MSI in our driver catalogue
were also mapped as actionable at level A evidence based on clinical
annotation in the OncoKB database. For each sample, we reported
the highest level of predicted actionability, ranked first by evidence
level and then by on-label vs off-label.
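
The ranking and the ESCAT mapping described above can be sketched as follows. The `Evidence` record, its field names and the function names are hypothetical; the sketch assumes that evidence has already been mapped to the common data model and annotated as on-label or off-label.

```python
from dataclasses import dataclass
from typing import List, Optional

LEVEL_RANK = {"A": 0, "B": 1}  # lower rank = stronger evidence

# Mapping to ESCAT tiers as stated in the text: (level, on_label) -> tier
ESCAT_TIER = {
    ("A", True): "I-A+B",
    ("A", False): "I-C",
    ("B", True): "II-A+B",
    ("B", False): "III-A",
}


@dataclass(frozen=True)
class Evidence:
    treatment: str
    level: str               # "A" or "B"
    on_label: bool
    resistant: bool = False  # True if the evidence supports resistance


def actionable_evidence(items: List[Evidence]) -> List[Evidence]:
    """Drop treatments with resistance evidence at the same or a higher
    evidence level, keeping only A and B level response evidence."""
    kept = []
    for item in items:
        if item.resistant or item.level not in LEVEL_RANK:
            continue
        blocked = any(
            other.resistant
            and other.treatment == item.treatment
            and other.level in LEVEL_RANK
            and LEVEL_RANK[other.level] <= LEVEL_RANK[item.level]
            for other in items
        )
        if not blocked:
            kept.append(item)
    return kept


def highest_actionability(items: List[Evidence]) -> Optional[Evidence]:
    """Rank first by evidence level, then on-label before off-label."""
    kept = actionable_evidence(items)
    if not kept:
        return None
    return min(kept, key=lambda e: (LEVEL_RANK[e.level], not e.on_label))
```

Note that, under this ordering, level A off-label evidence outranks level B on-label evidence, because evidence level is compared first.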


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


All data described in this study are freely available for academic use from
the Hartwig Medical Foundation through standardized procedures and
request forms that can be found at
https://www.hartwigmedicalfoundation.nl/en/appyling-for-data/.

Available data include germline and tumour raw sequencing data (BAM 
files, including non-mapped reads), annotated somatic and germline 
variants (VCF files with annotated SNV and indels, and pipeline output 
files for purity and ploidy status as well as copy number alteration and 
structural variants) and clinical data. Examples of output files can be 
found at https://resources.hartwigmedicalfoundation.nl. In brief, a 
data request can be initiated by filling out the standard form in which 
the intended use of the requested data is motivated. First, advice on
scientific feasibility and validity is obtained from experts in the field;
this is used as input by an independent data access board, which also
evaluates whether the intended use of the data is compatible with the consent
given by the patients and whether there are any applicable legal or ethical
constraints. Upon formal approval by the data access board, a standard
license agreement that does not have any restrictions regarding
intellectual property resulting from the data analysis needs to be signed
by an official organization representative before access to the data
is granted. After approval, access to data is provided under a license
model, with the only main restriction that the data can only be used
for the research detailed in the original request. Raw data files will be 
made available through a dedicated download portal with two-factor 
authentication. 

Non-privacy sensitive somatic variants can also be browsed and explored 
through an open access web-based interface which can be accessed at 
http://database.hartwigmedicalfoundation.nl/. 


Code availability 


All code used is open source and available from third parties or
developed by Hartwig Medical Foundation (https://github.com/hartwigmedical/).
A full list of tools and versions used including links to the source
code is provided in the Supplementary Information.
Extended Data Fig. 1 | Hartwig sample workflow, biopsy locations and
sequence coverage. a, Sample workflow from patient to high-quality WGS
data. A total of 4,018 patients were enrolled in the study between April 2016 and
April 2018. For 9% of patients, no blood and/or biopsy material was obtained,
mostly because conditions of patients prohibited further study participation.
Up to four fresh-frozen biopsies were obtained per patient, and were
sequentially analysed to identify a biopsy with more than 30% tumour
cellularity as determined by routine histology assessment. For 859 patients, no
suitable biopsy was obtained, and 2,796 patients were further processed for
WGS analysis. In total, 44 and 29 samples failed in either DNA isolation or
library preparation and raw WGS data quality control tests, respectively. For a
further 385 samples, the WGS data were of good quality, but the tumour
purity determined from the WGS data (PURity & PLoidy Estimator; PURPLE) was
less than 20%, making reliable and comprehensive somatic variant calling
impossible; these samples were therefore excluded. Eventually, 2,338 pairs of
tumour and normal tissue samples with high-quality WGS data were obtained,
which were supplemented with 182 pairs from pre-April 2016, adding up to
2,520 pairs of tumour and normal samples that were included in this study.
b, Breakdown of cohort by biopsy location. Tumour biopsies were taken from a
broad range of locations. Primary tumour type is shown on the left, and the
biopsy location on the right. c, Distribution of sample sequencing depth for
tumour and blood reference samples (n = 2,520 independent samples for each
category). The median for each is indicated by a horizontal bar.
Extended Data Fig. 2 | Mutational context distribution per tumour type.
a-e, Variant subtype, mutational context or signature per individual sample for
each SNV (a), SNV by COSMIC signature (b), MNV (c), indel (d) or SV (e). Each
column chart is ranked within tumour type by mutational load from low to high
in that variant class. MNVs are classified by the dinucleotide substitution, with
‘NN’ referring to any dinucleotide combination. SVs are classified by type.
DEL, deletion (with microhomology (MH), in repeats and other); DUP, tandem
duplication; INV, inversion; TRL, translocation; INS, insertion. Highly
characteristic known patterns can be discerned, for example the high rates of
C>T SNVs, CC>TT MNVs and COSMIC S18 for skin tumours, and high rates of C>A
SNVs and COSMIC S4 for lung tumours.
Extended Data Fig. 3 | SNV mutational signatures. a, Prevalence and median
mutational load of fitted COSMIC SNV mutational signature per cancer type
(the number of patients per category is provided). The observed distribution
largely reflects the patterns observed from primary cancers. b, Box plots of
relative residuals in fits per cancer type (sum of absolute difference between
the fitted and actual divided by total mutational load). Boxes represent the
twenty-fifth to seventy-fifth percentiles, and whiskers extend to the highest
and lowest values within 1.5× the upper/lower quartile distance, with outliers
shown as dots. c, Proportion of variants by 96 trinucleotide mutational context
for two selected samples with high residuals and high mutational load. Top and
bottom panels represent the highest outliers for breast (HMFO02896) and
oesophagus (HMFO01562) cancers, respectively, from b. Both of these samples
were previously treated with the experimental drug SYD985, a
duocarmycin-based HER2-targeting antibody-drug conjugate.
Extended Data Fig. 4 | Mutational load, genome-wide analyses and drivers.
a, Proportion of samples by cancer type classified as microsatellite instable
(MSIseq score >4). b, Proportion of samples with a high mutational burden (TMB
>10 SNVs per Mb). c-e, Scatter plots of mutational load per sample for indels
versus SNVs (c), indels versus SVs (d), and SVs versus SNVs (e). MSI (MSIseq
score > 4) and high TMB (>10 SNVs per Mb) thresholds are indicated. f-h, Mean
mutational load versus driver rate for SNVs (f), indels (g) and SVs (h), grouped by
cancer type. MSI samples were excluded.
Extended Data Fig. 6 | Effect of bioinformatic analysis pipeline on variant
calling. a-d, Comparison of observed mutational count per sample for SNVs
(a), MNVs (b), indels (c) and SVs (d) on 24 patient samples analysed by the
PCAWG and HMF pipelines. The PCAWG pipeline was found to have a 43% lower
sensitivity for indels (which is based on a consensus calling), 18% lower for SVs
(based on a different algorithm) and 6% lower for MNVs (only includes MNVs
involving two nucleotides), with nearly the same sensitivity for SNVs.
e, f, Cumulative distribution function plot for each tumour type (the number of
independent patients per category is provided) of coverage and pipeline-
adjusted mutational load for SNVs and MNVs (e) and indels and SVs (f).
Mutational loads as shown in Fig. 1 were adjusted for the sensitivity effects
caused by differences in sequencing depth coverage (Extended Data Fig. 4) and
analysis pipeline differences (a-d). After this correction, the TMB between
primary and metastatic cohorts across all variant types is much more
comparable (e, f), which indicates that technical differences do contribute to the
reported mutational load differences between primary and metastatic tumours.
Prostate cancer is the most notable exception, with approximately twice the
TMB in all variant classes, although more subtle differences, potentially driven
by biology, can also be observed for other tumour and mutation types. For
cancer types that are comparable with the PCAWG cohort, the equivalent
PCAWG numbers are shown by dotted lines. The median for each cohort is
shown by a horizontal line.
Extended Data Fig. 7 | Somatic Y chromosome loss and driver
amplifications. a, Proportion of male tumours with somatic loss of more than
50% of Y chromosome (dark blue) grouped by tumour type. b, Mean rate of
amplification drivers per cancer type. c, Breakdown of the number of
amplification drivers per gene by cancer type. d, Mean rate of drivers per
variant type for samples with and without WGD.
Extended Data Fig. 8 | Significantly mutated genes. Tile chart showing genes
found to be significantly mutated per cancer type (the number of independent
patients per category is provided) and pan-cancer using dNdScv. Gene names
marked in orange are also significant in a previous study, but not found in the
COSMIC gene census or curated gene databases. Gene names marked in red are
novel in this study. Significance (Poisson with Benjamini-Hochberg false
discovery rate correction) is indicated by the intensity of shading.
Extended Data Fig. 9 | Oncogenic hotspots. Count of driver point mutations
by variant type. Known pathogenic mutations curated from external databases
are categorized as hotspot mutations. Mutations within five bases of a known
Extended Data Fig. 10 | Driver co-occurrence. a, Mutated driver gene pairs
that are significantly positively (right) or negatively (left) correlated in
individual tumour types (number of independent samples per tumour type is
indicated in Fig. 1) sorted by q value (Fisher exact test adjusted for false
discovery rate). Pairs of genes on the same chromosome that are frequently
co-amplified or co-deleted by chance are excluded from positively correlated
results. The 20 significant findings include previously reported co-occurrence
of mutated DAXX-MEN1 in pancreatic NET (q=7 10™*), and CDH1-SPOP in
prostate tumours (q=5 x10‘), as well as negative associations of mutated genes
within the same signal transduction pathway such as KRAS-BRAF (q=4 x10)
and KRAS-NRAS (q=0.008) in colorectal cancer, BRAF-NRAS in skin cancer
(q=6*10-?), CDKN2A-RB1 in lung cancer (q=8 x 10°) and APC-CTNNB1 in
colorectal cancer (q=3 x10). APC is also strongly negatively correlated with
both BRAF (q=9 x 10°) and RNF43 (q=4 x10), which together are characteristic
of the serrated molecular subtype of colorectal cancers. SMAD2-SMAD3 are
highly positively correlated in colorectal cancer (q=0.02), which supports a
previous report in a large cohort of colorectal cancers. In breast cancer, we
found several novel relationships, including a positive relationship for GATA3-
VMP1 (q=6 x10) and FOXA1-PIK3CA (q=3 10°), and a negative relationship for
ESR1-TP53 (q=9 x10“) and GATA3-TP53 (q=5*10°).
Statistical parameters


When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main 
text, or Methods section). 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND 
variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Clearly defined error bars 
State explicitly what error bars represent (e.g. SD, SE, CI)
Data collection: No software was used for data collection.

Data analysis: All analyses are based on open source software, which is available from third parties or developed by Hartwig Medical Foundation and
available on GitHub (https://github.com/hartwigmedical/). The table below lists all external and internally developed software/tools,
versions used and public links to the source code.


External software/tools:

Tool                             Version       Source
bcl2fastq                        2.17 to 2.20  http://sapac.support.illumina.com/downloads/bcl2fastq-conversion-software-v2-20.html
BWA-mem                          0.7.5a        https://github.com/lh3/bwa
Sambamba                         0.6.5         https://github.com/biod/sambamba/releases/tag/v0.6.5
Picard                           1.141         https://broadinstitute.github.io/picard/
GATK                             3.4.46        https://software.broadinstitute.org/gatk/download/auth?package=GATK-archive&version=3.4-46-gbc02625
Strelka                          1.0.14        https://github.com/Illumina/strelka
MutationalPatterns               1.4.3         https://bioc.ism.ac.jp/packages/3.6/bioc/html/MutationalPatterns.html
Manta                            1.0.3         https://github.com/Illumina/manta
STAR-fusion                      ??            https://github.com/STAR-Fusion/STAR-Fusion/releases
Bioconductor CopyNumber package  1.24.0        http://bioconductor.org/packages/release/bioc/html/copynumber.html
ASCAT                            2.52          https://github.com/Crick-CancerGenomics/ascat
dNdScv                           0.1.0         https://github.com/im3sanger/dndscv/releases/tag/0.1.0
Circos                           0.69.6        http://circos.ca/distribution/circos-0.69-6.tgz
samtools                         1.2           https://github.com/samtools/samtools/releases/tag/1.2
snpeff                           4.3s          https://sourceforge.net/projects/snpeff/files/snpEff_v4_3s_core.zip/download
vcftools                         0.1.14        https://vcftools.github.io/index.html
bcftools                         1.9           https://github.com/samtools/bcftools/releases/download/1.9/bcftools-1.9.tar.bz2

HMF internal software/tools:

Tool                             Version       Source
Strelka_post_process             1.4           https://github.com/hartwigmedical/hmftools/releases/tag/strelka-post-process-v1-4
HMF pipeline                     v3.0          https://github.com/hartwigmedical/pipeline/releases/tag/v3.0
SAGE                             1.1           https://github.com/hartwigmedical/hmftools/releases/tag/sage%E2%80%94v1-1
BPI                              1.5           https://github.com/hartwigmedical/hmftools/releases/tag/bpi-v1-5
PURPLE                           2.10          https://github.com/hartwigmedical/hmftools/releases/tag/purple-v2-10
Amber                            1.5           https://github.com/hartwigmedical/hmftools/releases/tag/amber-v1-5
Cobalt                           1.4           https://github.com/hartwigmedical/hmftools/releases/tag/cobalt-v1-4
healthchecker                    2.1           https://github.com/hartwigmedical/hmftools/tree/master/health-checker
R analysis suite                 1.3           https://github.com/hartwigmedical/scripts/releases/tag/pancancerpaper-v1-3
Life sciences study design
Sample size The metastatic tumor sample cohort described in the paper consists of 2520 independent samples from 2399 patients (including 121 repeat
biopsies) collected in 41 hospitals (academic, teaching and general hospitals). No sample size calculations were performed as the main aim of
the study was to build up a resource.


Data exclusions Samples that failed predefined QC criteria or with a tumor purity below 20% were excluded from all analyses and not included in the 2520 
sample cohort (see Extended Data Fig 1). The tumor purity threshold was defined after bioinformatic tool optimization and simulations with 
titration series of reference samples and validation experiments on selected cohort samples. 


Replication Independent repeat processing of raw data of the same sample results in the same variant call data.

Randomization Not applicable as the primary goal of this study was to create a resource. The study and analyses do not include any experimental
manipulation and only involved the collection of tissue and blood material, the generation of whole genome sequencing data and the
collection of clinical data from medical records.

Blinding Not applicable as the primary goal of this study was to create a resource. The study and analyses do not include any experimental
manipulation and only involved the collection of tissue and blood material, the generation of whole genome sequencing data and the
collection of clinical data from medical records.
Unique biological materials


Policy information about availability of materials 


Obtaining unique materials Tumor biopsies are collected as part of two clinical studies and remaining material is deposited in local biobanks as described in
the methods. Because of the nature of the material (very small amount), broad accessibility is not possible.


Human research participants 


Policy information about studies involving human research participants 


Population characteristics All patients included were diagnosed with metastatic disease, considered fit enough to undergo an invasive core-needle
biopsy and planned to start treatment. The median age is 63 years (range 18 - 89). The cohort includes 1221 female and 1178
male subjects. Age and gender information of each patient is included in Supplementary Table 2. All patients were seen in
hospitals in the Netherlands, including academic, teaching and general hospitals.


Recruitment Metastatic cancer patients were asked to participate in the studies in any of the 41 participating hospitals. Recruitment involved
hundreds of medical specialists and research nurses, which minimizes self-selection biases. Recruitment was independent of
tumor type. An important requirement for participation was the ability to safely undergo a tumor biopsy. Health conditions and
lesion-site-related risk could therefore have resulted in exclusion of patients.
KRAS is the most frequently mutated oncogene in cancer and encodes a key signalling
protein in tumours. The KRAS(G12C) mutant has a cysteine residue that has been
exploited to design covalent inhibitors that have promising preclinical activity. Here
we optimized a series of inhibitors, using novel binding interactions to markedly
enhance their potency and selectivity. Our efforts have led to the discovery of AMG 510,
which is, to our knowledge, the first KRAS(G12C) inhibitor in clinical development. In
preclinical analyses, treatment with AMG 510 led to the regression of KRAS G12C tumours
and improved the anti-tumour efficacy of chemotherapy and targeted agents. In
immune-competent mice, treatment with AMG 510 resulted in a pro-inflammatory
tumour microenvironment and produced durable cures alone as well as in combination
with immune-checkpoint inhibitors. Cured mice rejected the growth of isogenic
KRAS G12D tumours, which suggests adaptive immunity against shared antigens.
Furthermore, in clinical trials, AMG 510 demonstrated anti-tumour activity in the first
dosing cohorts and represents a potentially transformative therapy for patients for
whom effective treatments are lacking.


The KRAS oncoprotein is a GTPase and an essential mediator of intracellular
signalling pathways that are involved in tumour cell growth and survival.
In normal cells, KRAS functions as a molecular switch, alternating
between inactive GDP-bound and active GTP-bound states. Transition
between these states is facilitated by guanine nucleotide-exchange factors,
which load GTP and activate KRAS, and GTP hydrolysis, which is
catalysed by GTPase-activating proteins to inactivate KRAS. GTP binding
to KRAS promotes binding of effectors to trigger signal transduction
pathways including the RAF-MEK-ERK (MAPK) pathway. Somatic,
activating mutations in KRAS are a hallmark of cancer and prevent the
association of GTPase-activating proteins, thus stabilizing effector
binding and enhancing KRAS signalling. Although there are clinically
approved inhibitors of several MAPK pathway proteins (for example,
inhibitors of MEK, BRAF and EGFR) for a subset of tumour types, to date
there have been no clinical molecules that are selective for KRAS-mutant
tumours. Moreover, several MAPK-pathway-targeting therapies are
contra-indicated for treatment of KRAS-mutant tumours owing to a lack of
clinical efficacy. Additionally, non-tumour or non-mutant selective
therapies can introduce on-target toxicities due to the inhibition of MAPK
signalling in normal cells. This might limit the ability to combine such
agents with standard-of-care treatments or immunotherapy. Thus, there
is a considerable unmet need for the development of tumour-selective
therapies that do not introduce liabilities for normal cells.

KRAS G12C is present in approximately 13% of lung adenocarcinoma, 3%
of colorectal cancer and 2% of other solid tumours. The mutant cysteine
of KRAS(G12C) resides adjacent to a pocket (P2) that is present in the
inactive GDP-bound form of KRAS. The proximity of P2 and the mutant
cysteine led to a broad search for covalent inhibitors, eventually resulting
in the identification of ARS-1620. This preclinical tool compound
was a milestone for proof-of-concept, mutant-selective KRAS inhibition.
We identified a series of novel acrylamide-based molecules that
utilize a previously unexploited surface groove in KRAS(G12C) to
substantially enhance potency and selectivity. Intensive electrophile
screening and structure-based design culminated in the discovery of AMG
510, which is, to our knowledge, the first KRAS(G12C) inhibitor to reach
clinical testing in humans (clinicaltrials.gov identifier NCT03600883).
Here we present the data on the preclinical activity of AMG 510, its
ability to induce tumour-cell killing as monotherapy or when combined
with other therapies, and the marked impact of AMG 510 on immune
cell infiltration, which renders the tumour microenvironment highly
sensitive to immunotherapy. We also present promising evidence for
clinical efficacy.
Fig. 1 | AMG 510 exploits a cryptic groove in KRAS(G12C) to enhance potency
and selectivity. a, X-ray co-crystal structure of KRAS(G12C/C51S/C80L/C118S)
bound to GDP and AMG 510 at a resolution of 1.65 Å. Cyan dashes, van der Waals
contacts; white dashes, water-mediated interactions; yellow dashes,
ligand-protein hydrogen bond interactions (PDB: 6OIM). b, Inhibition of p-ERK and
occupancy of KRAS(G12C) by AMG 510 after a 2-h treatment. Data are
mean ± s.d., n = 3 replicates. c, Kinetic properties as determined by inhibition of
p-ERK. k_obs and standard error of the curve were determined from nonlinear


Enhanced binding and potency of AMG 510 


Direct inhibition of KRAS(G12C) was validated by ARS-1620, but the
identification of improved inhibitors suitable for clinical testing has
proven difficult. One key challenge is suboptimal potency owing to the
small volume of the pocket occupied by ARS-1620, which offers limited
avenues for additional protein-ligand interactions. This was illustrated
by the X-ray crystal structure of the KRAS(G12C)-ARS-1620 covalent
complex (Extended Data Fig. 1a), in which hydrogen bonding between
ARS-1620 and His95 featured prominently. Our key breakthrough was
the discovery that a surface groove, created by an alternative orientation
of His95, could be occupied by aromatic rings, which enhanced
interactions with the KRAS(G12C) protein. AMG 510 emerged as the
top candidate from an optimization campaign of His95 groove-binding
molecules, as it represented the convergence of improved potency and
favourable development properties. The X-ray co-crystal structure of
the covalent AMG 510-KRAS(G12C) complex (Fig. 1a and Extended Data
Table 1) highlighted the binding of AMG 510 in the P2 pocket of KRAS.
Although portions of the AMG 510 and ARS-1620 ligands are structurally
related and overlap (Extended Data Fig. 1b), the His95 groove is a novel
feature of the binding of AMG 510 (Fig. 1a and Extended Data Fig. 1b).
The highly optimized isopropyl-methylpyridine substituent of AMG
510 that occupied the His95 groove engaged in a continuous network
of 25 ligand-protein van der Waals contacts extending from the backbone
of helix 2 (His95, Tyr96) to the backbone of the flexible switch
II loop (Fig. 1a). These enhanced interactions improved the potency
of AMG 510 approximately 10-fold (mean half-maximum inhibitory
concentration (IC50) = 0.09 μM), as compared to ARS-1620 in a
nucleotide-exchange assay with recombinant GDP-bound KRAS(G12C). AMG
510 did not inhibit wild-type KRAS and a non-reactive analogue did
not inhibit KRAS(G12C) (Extended Data Fig. 1c, d). The kinetics of the
curve fitting of experimental values. d, e, Cellular activity of AMG 510 across a
panel of KRAS G12C and non-KRAS G12C mutant cell lines measured as the inhibition
of p-ERK after a 2-h treatment (d) and the effects on cell viability after a 72-h
treatment (e). Representative examples of the data are shown (see
Supplementary Table 1 for the number of replicates). f, Cysteine proteome
analysis of NCI-H358 whole-cell lysates after a 4-h treatment with 1 μM AMG 510
or DMSO. n = 5 independent replicates, P values were derived from a two-tailed
Student's t-test.


reaction between AMG 510 and GDP-KRAS(G12C) were measured by
mass spectrometry and exhibited a marked improvement compared
to ARS-1620 (Extended Data Fig. 1e, f). Relative to cysteine-targeted
kinase inhibitors in the clinic, AMG 510 exhibited a larger maximal
rate of inactivation (k_inact), consistent with the KRAS-induced catalysis
mechanism that has previously been described for ARS-1620. The
non-specific reactivity of AMG 510 with glutathione was relatively slow
(t1/2 = 196 min) and within the range of clinical acrylamides.
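
The kinetic quantities named here (a maximal inactivation rate and a concentration giving a half-maximal rate) are commonly interpreted with the standard two-step model for covalent inhibitors, in which the observed rate saturates with inhibitor concentration. The sketch below assumes that model; the source does not give the fitting procedure in this section, and the numerical values of `K_INACT` and `K_I` are hypothetical placeholders, not results from the paper. Only the glutathione half-life (196 min) is taken from the text.

```python
import math

def k_obs(conc, k_inact, K_I):
    """Observed first-order inactivation rate under the two-step model.

    k_obs = k_inact * [I] / (K_I + [I]); same time units as k_inact.
    """
    return k_inact * conc / (K_I + conc)

def fraction_modified(conc, k_inact, K_I, t):
    """Fraction of target covalently modified after time t at constant [I]."""
    return 1.0 - math.exp(-k_obs(conc, k_inact, K_I) * t)

# Hypothetical illustration values (NOT from the paper).
K_INACT = 0.01  # s^-1, assumed maximal inactivation rate
K_I = 10.0      # uM, assumed concentration giving half-maximal rate

# At [I] = K_I the observed rate is half of k_inact.
half_max_rate = k_obs(K_I, K_INACT, K_I)

# Pseudo-first-order rate implied by the reported glutathione half-life.
gsh_rate = math.log(2) / (196 * 60)  # s^-1, from t1/2 = 196 min
```

At saturating inhibitor the observed rate approaches `k_inact`, which is why a larger `k_inact` translates into faster target engagement at high exposure.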


AMG 510 inhibits signalling and growth 


The cellular activity of AMG 510 was assessed by measuring basal phosphorylation
of ERK1/2 (p-ERK) and by mass spectrometry to detect the covalent
conjugation or occupancy of KRAS(G12C) by AMG 510. In two KRAS G12C cell
lines, NCI-H358 and MIA PaCa-2, AMG 510 almost completely inhibited
p-ERK (IC50 ≈ 0.03 μM) after a 2-h treatment and was 20-fold more potent
than ARS-1620 (Extended Data Fig. 1g). This inhibition closely tracked the
occupancy of KRAS(G12C) by AMG 510, with near maximal levels achieved
in both assays at around 0.2 μM (Fig. 1b). AMG 510 also potently impaired
cellular viability in both NCI-H358 and MIA PaCa-2 (IC50 = 0.006 μM and
0.009 μM respectively, approximately 40-fold more potent than ARS-1620;
Extended Data Fig. 1h). Examining the time and concentration dependence
of the inhibition of p-ERK in these lines revealed a kinetic advantage that
favoured AMG 510 by approximately 22-fold (Fig. 1c and Extended Data
Fig. 1f). The maximal inhibition rate of p-ERK by AMG 510 is approximately
twofold greater than the rate-limiting GTP-KRAS(G12C) hydrolysis rate
that has recently been proposed. To estimate the GTPase rate by another
method, we used a SHP2 inhibitor to eliminate all upstream signalling
to KRAS, which yielded a rate (9.4 × 10⁻⁴ s⁻¹, t1/2 = 12.2 min; Extended Data
Fig. 2a) that was congruent with what was observed for AMG 510.
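
The rate and half-life quoted for the SHP2-inhibitor experiment are related by the usual first-order decay identity, t1/2 = ln 2 / k. As a quick consistency check, the following minimal sketch converts the reported rate constant into a half-life; the function name is illustrative only.

```python
import math

def half_life_minutes(rate_per_second):
    """Half-life (min) of a first-order process with rate constant k (s^-1)."""
    return math.log(2) / rate_per_second / 60.0

# Rate reported in the text for the SHP2-inhibitor experiment.
k_gtpase = 9.4e-4  # s^-1
t_half = half_life_minutes(k_gtpase)  # about 12.3 min
```

The result, roughly 12.3 min, agrees with the reported 12.2 min to within the rounding of the rate constant.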
For further evaluation of the signalling effect of KRAS(G12C) inhibition,
two cell lines were treated with a titration of AMG 510 for 4 or
24 h, and signalling nodes were analysed by immunoblot (Extended
Data Fig. 2b). The KRAS species shifted mobility upon the formation of
covalent adducts with AMG 510 and accumulated with increasing time
and dose, consistent with downstream inhibition of the MAPK pathway
(that is, p-MEK1/2 and p-ERK1/2) in both cell lines (Extended Data Fig. 2b).
KRAS(G12C) inhibition by AMG 510 also led to an accumulation of active
EGFR (p-EGFR(Y1068)). Inhibition of AKT phosphorylation (p-AKT) was
apparent in one cell line, whereas a decrease in S6 phosphorylation
(p-S6) and an increase in cleaved caspase-3 were observed at 24 h in
both lines, suggesting induction of apoptosis. In time course studies,
treatment with AMG 510 at 0.1 μM (Extended Data Fig. 2c) elicited rapid
(<2 h) and sustained (>24 h) effects on MAPK and EGFR pathway signalling,
whereas p-S6 and caspase cleavage emerged 8-16 h after treatment
in both lines. To assess activity and selectivity, AMG 510 was profiled in
22 cell lines that had heterozygous or homozygous KRAS G12C, KRAS mutations
other than KRAS G12C or wild-type KRAS. Treatment with AMG 510 for
2 h showed that basal p-ERK was inhibited in all KRAS G12C cell lines, with
IC50 values ranging from 0.010 μM to 0.123 μM (Fig. 1d and Supplementary
Table 1). AMG 510 did not inhibit p-ERK in any of the non-KRAS G12C
lines (IC50 > 10 μM; Fig. 1d and Supplementary Table 1). In cell-viability
assays, AMG 510 impaired the growth of all KRAS G12C cell lines, except
SW1573, with IC50 values ranging from 0.004 μM to 0.032 μM (Fig. 1e
and Supplementary Table 1). Non-KRAS G12C lines were insensitive to AMG
510 (IC50 > 7.5 μM; Fig. 1e and Supplementary Table 1). As reported for
other KRAS(G12C) inhibitors, spheroid growth conditions enhanced
the sensitivity of most tested lines to AMG 510 (Extended Data Fig. 2d
and Supplementary Table 1). To further determine the selectivity of the
covalent interaction of AMG 510 with KRAS(G12C) and to identify other
potential 'off-target' cellular proteins, cysteine-proteome profiling by
mass spectrometry was performed as previously described. After 4-h
treatment with DMSO or 1 μM AMG 510 (>30-fold above p-ERK IC50), the
cysteine proteome was enriched and peptides were identified. Among
6,451 unique cysteine-containing peptides, the Cys12 peptide from
KRAS(G12C) was the only peptide that met the criteria for covalent target
engagement (Fig. 1f and Supplementary Table 2).
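
Potency comparisons of the kind made above (IC50 values and fold differences between a test concentration and an IC50) can be illustrated with a simple Hill-type dose-response curve. The sketch below is a generic illustration, not the curve-fitting method used in the study; the Hill slope of 1 is an assumption, and only the 1 μM test concentration and the approximate p-ERK IC50 of 0.03 μM come from the text.

```python
def fraction_inhibited(conc, ic50, hill=1.0):
    """Fractional inhibition at a given concentration for a Hill-type curve."""
    return conc ** hill / (ic50 ** hill + conc ** hill)

def concentration_for_inhibition(ic50, fraction, hill=1.0):
    """Concentration needed to reach a fractional inhibition (0 < fraction < 1)."""
    return ic50 * (fraction / (1.0 - fraction)) ** (1.0 / hill)

def fold_over(conc, ic50):
    """How many fold a concentration lies above an IC50."""
    return conc / ic50

# Values taken from the text: 1 uM test concentration, p-ERK IC50 of ~0.03 uM.
fold = fold_over(1.0, 0.03)          # roughly 33, i.e. ">30-fold"
at_ic50 = fraction_inhibited(0.03, 0.03)  # 0.5 by definition

# Under the ASSUMED Hill slope of 1, 90% inhibition needs 9 x IC50.
ic90_if_hill_1 = concentration_for_inhibition(0.03, 0.9)
```

With a slope of 1, 90% inhibition requires nine times the IC50; measured IC90 values can differ from this when the real curve is steeper or shallower.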

The effect of AMG 510 treatment on KRAS(G12C) signalling in vivo
was evaluated in pharmacodynamics assays in which p-ERK was measured.
In three KRAS G12C tumour models, AMG 510 inhibited p-ERK in
a dose-dependent manner 2 h after treatment (Fig. 2a and Extended
Data Fig. 3a, b) and maximal inhibition was observed at 30-100 mg kg⁻¹.
Time-course pharmacodynamics assays demonstrated peak plasma and
tumour exposure of AMG 510 0.5 h after a single dose (10 mg kg⁻¹), leading
to maximal inhibition of p-ERK 2-4 h after treatment and sustained
inhibition for 48 h (Extended Data Fig. 3c, d). This was consistent with
covalent inhibition of the long-lived KRAS(G12C) protein (t1/2 ≈ 20-24 h;
Extended Data Fig. 3e). Occupancy of KRAS(G12C) by AMG 510 was also
measured by mass spectrometry and approached 100% at 100 mg kg⁻¹,
correlating with maximal suppression of p-ERK (Fig. 2b and Extended
Data Fig. 3f). Time-course studies indicated that occupancy was detected
by 0.5 h and maximal at 2 h (Extended Data Fig. 3g).


Mutant-selective tumour inhibition in vivo 

In mice with xenografts of human tumour cells, AMG 510 significantly
inhibited the growth of MIA PaCa-2 T2 and NCI-H358 tumours at all doses,
and regression of tumours was observed at higher doses (Fig. 2c, d).
The dose of AMG 510 that was required to achieve the regression of MIA
PaCa-2 T2 tumours was at least 3.3-fold lower than ARS-1620 (Extended
Data Fig. 4a). Plasma exposures above the cellular IC90 of p-ERK for more
than 2 h resulted in tumour regression (Extended Data Fig. 4b, c). AMG
510 also inhibited the growth of KRAS G12C-mutant patient-derived xenografts
(Fig. 2e and Extended Data Fig. 4d). By contrast, AMG 510 treatment
had no effect on KRAS G12V tumour growth (Extended Data Fig. 4e).
In immune-competent mice, AMG 510 resulted in regression of CT-26
Fig. 3 | Clinical activity of AMG 510 in patients with lung cancer in
first-in-human dose-escalation study. a, Study design. b, Computed tomography
scans of patients with KRAS G12C lung carcinoma treated with AMG 510 (left,
180 mg; right, 360 mg). Representative pre-treatment (baseline) and
post-treatment (Rx) scans. Lesions are outlined by a red outline or highlighted by red


KRAS G12C tumours, a mouse syngeneic tumour model generated by
CRISPR technology (Fig. 2f). Two of the ten mice in the 100 mg kg⁻¹ group
had no detectable tumours at the end of the study (day 29). However,
regression of the tumours lacked durability (Fig. 2g), possibly owing to
incomplete inhibition of p-ERK (Extended Data Fig. 3b). Therefore, a dose
of 200 mg kg⁻¹ of AMG 510 was evaluated, resulting in near-complete
inhibition of p-ERK (Extended Data Fig. 3b) and durable cures in eight
out of ten mice (Fig. 2h), in which AMG 510 plasma levels were just below
the cellular IC90 (Extended Data Fig. 4f). Intriguingly, in the same tumour
model but in mice that lacked T cells, AMG 510 induced regression but
not cures, suggesting that the immune system drives cures in
immune-competent mice (Extended Data Fig. 4g).


Evidence of clinical activity 


The enhanced potency and efficacy of AMG 510 prompted its selection
as, to our knowledge, the first KRAS(G12C) inhibitor to enter clinical
trials (clinicaltrials.gov identifier NCT03600883). AMG 510 was administered
orally, once daily, in escalating dosing cohorts (Fig. 3a). In the
first two dosing cohorts there were four patients with non-small-cell
lung carcinoma (180 mg, n = 3; 360 mg, n = 1). Treatment with AMG 510
resulted in objective partial responses (as per RECIST 1.1) in two patients
(Fig. 3b and Extended Data Fig. 5) and stable disease in two patients. The
two patients with a partial response had documented disease progression
on multiple previous systemic treatments, including carboplatin,
pemetrexed and nivolumab. After 6 weeks of treatment with
AMG 510, the first responder (180 mg) exhibited tumour shrinkage of
34%, and the second (360 mg) exhibited a tumour reduction of 67%. A
follow-up scan at 18 weeks revealed complete resolution of target lesions
in the second responder. AMG 510 exposures in both patients were above
the cellular IC90 of p-ERK (165 nM in MIA PaCa-2; Extended Data Fig. 4b)
for 24 h (Fig. 3c). These patients remain active on AMG 510 treatment,
with treatment durations of 42 and 29 weeks, respectively, as of the cut-off
date for the present data. We show that these patients responded to a
arrows. Left images show the lower-right lobe of the lungs (top), upper-left lobe
of the lungs (middle) and lymph node (bottom). Right images show the
upper-left lobe of the lungs (top) and pleura (middle and bottom). Lesions in the
18-week scans of the patient who received 360 mg AMG 510 were considered too
small to accurately measure. c, Pharmacokinetic data from the two responders.


mutant-specific KRAS inhibitor, representing a milestone for patients
with KRAS G12C-mutant cancer.
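
The response categories cited above follow RECIST 1.1, under which target-lesion response is judged from the change in the sum of lesion diameters: a decrease of at least 30% from baseline is a partial response, and an increase of at least 20% (and at least 5 mm) from the smallest recorded sum is progression. The sketch below is a deliberately simplified classifier for target lesions only; it ignores non-target lesions, new lesions and lymph-node rules, and the baseline sum of 100 mm is a hypothetical value chosen so that the reported 34% and 67% reductions are easy to follow.

```python
def target_lesion_response(baseline_mm, current_mm, nadir_mm=None):
    """Simplified RECIST 1.1 category from sums of target-lesion diameters.

    Returns "CR", "PR", "PD" or "SD". Non-target and new lesions are ignored.
    """
    nadir = baseline_mm if nadir_mm is None else min(nadir_mm, baseline_mm)
    if current_mm == 0:
        return "CR"  # all target lesions have disappeared
    if current_mm >= 1.2 * nadir and current_mm - nadir >= 5:
        return "PD"  # >=20% and >=5 mm increase over the smallest sum
    if current_mm <= 0.7 * baseline_mm:
        return "PR"  # >=30% decrease from baseline
    return "SD"

# Hypothetical baseline sum of diameters (mm); reductions are from the text.
BASELINE = 100.0
first_responder = target_lesion_response(BASELINE, BASELINE * (1 - 0.34))
second_responder = target_lesion_response(BASELINE, BASELINE * (1 - 0.67))
```

Both reductions fall past the 30% threshold, consistent with the partial responses reported at 6 weeks.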


AMG 510 improves efficacy of targeted agents 


The clinically validated strategy of combining BRAF and MEK inhibitors
suggests that the combinations of AMG 510 and other inhibitors
in the MAPK (and AKT) signalling pathways might enhance tumour-cell
killing and overcome resistance. Therefore, in vitro combination
experiments were conducted in several KRAS G12C cell lines with matrices
of AMG 510 and inhibitors of HER kinases, EGFR, SHP2, PI3K, AKT and
MEK (Extended Data Fig. 6a and Supplementary Table 3). As suggested
by the induction of p-EGFR by AMG 510 (Extended Data Fig. 2b), the
combination of AMG 510 with multiple agents resulted in synergistic
killing of NCI-H358 tumour cells (Fig. 4a, Extended Data Fig. 6a
and Supplementary Table 3). Synergy was more limited in other lines,
but the combination with a MEK inhibitor was synergistic in multiple
settings and was enhanced in spheroid growth conditions (Fig. 4a).
Significantly enhanced anti-tumour activity was also observed in vivo
with a minimally efficacious dose of AMG 510 in combination with a MEK
inhibitor, when compared to either of the single agents alone (Fig. 4b).
These data suggest that the clinical combination of AMG 510 with MAPK
inhibitors might eliminate bypass or residual signalling that could limit
efficacy or induce resistance.

Given the prevalence of KRAS G12C in lung adenocarcinoma, a
combination treatment of AMG 510 with carboplatin, a standard-of-care
chemotherapeutic, was investigated. Treatment with either
AMG 510 or carboplatin resulted in significant inhibition of NCI-H358
tumour growth in mice (Fig. 4c). However, combination treatment at
various doses resulted in significantly improved anti-tumour efficacy
(Fig. 4c and Extended Data Fig. 6b). The demonstration of enhanced
efficacy of the combination of a mutant-selective KRAS inhibitor and
a chemotherapeutic agent provides rationale for this approach in
the clinic.
Fig. 4 | AMG 510 combined with cytotoxic or targeted agents results in
enhanced efficacy. a, Synergy scores for AMG 510 combinations with targeted
agents represented as a heat map, with higher scores (darker red) denoting
stronger synergy. b, AMG 510 in combination with a MEK inhibitor (PD-0325901).
c, AMG 510 in combination with carboplatin. d, CT-26 KRAS G12C tumour growth in
individual mice. Lines with circles indicate tumour-free mice. e, Kaplan-Meier
analysis of survival end point (tumour size > 800 mm³). b, c, Data are


AMG 510 synergizes with immunotherapy 


Blockade of the immune checkpoint axis that involves programmed cell
death 1 (PD-1)-programmed death ligand 1 (PD-L1) is clinically validated
in multiple settings. As the long-term cures induced by AMG 510 in the
CT-26 KRAS G12C model were dependent on the engagement of the immune
system (Fig. 2h and Extended Data Fig. 4g), strategies such as anti-PD-1
therapy that further boost anti-tumour T cell activity may synergize with
AMG 510. The CT-26 KRAS G12C model is dependent on the KRAS G12C allele
(Extended Data Fig. 7a, b) and is sensitive to AMG 510 treatment (Fig. 2f
and Extended Data Fig. 3b). Furthermore, its parental line CT-26 has been
broadly used to evaluate the effects of immunotherapy as well as combinations
of immunotherapeutic and targeted agents. Therefore, we used
this model to evaluate the combination of anti-PD-1 immune checkpoint
inhibition with AMG 510, which was administered at a suboptimal dose
to enable the evaluation of combination effects. As shown above (Fig. 2f,
g), AMG 510 caused tumour regression in mice as a single agent (Fig. 4d),
mean ± s.e.m., n = 10 mice per group; ***P < 0.001 for combination treatment
compared with the single agent; ****P < 0.0001 for treatment group compared
with vehicle; P values were determined by repeated-measures ANOVA followed
by Dunnett's multiple-comparison test; *P < 0.001 regression by two-sided
Student's t-test. e, n = 10 mice per group; ****P < 0.0001 compared with vehicle;
*P < 0.001 combination versus AMG 510 or anti-PD-1 determined by two-sided
Mantel-Cox test.


but only one out of ten tumours remained completely regressed (Fig. 4d).
Anti-PD-1 monotherapy delayed tumour growth, with complete regression
in only one of ten tumours. Notably, combined treatment led to complete
responses in nine out of ten mice (Fig. 4d). Treatment was stopped after
day 43, and all complete responders remained cured 112 days later. Using
a surrogate end point (tumour volume >800 mm³), the combined treatment
significantly improved survival (Fig. 4e).
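
Survival with a surrogate end point of this kind is typically summarized with a Kaplan-Meier estimate, in which an animal counts as an event on the day its tumour first exceeds the size threshold and is censored if it never does. The sketch below is a minimal pure-Python estimator for illustration; the times and event flags are hypothetical and are not data from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival steps.

    times  : day of event or of last observation for each animal
    events : True if the end point was reached, False if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(1 for e in same_time if e)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(same_time)  # events and censored both leave the risk set
        i += len(same_time)
    return curve

# Hypothetical example: four animals, two reach the end point, two are censored.
example_curve = kaplan_meier([10, 20, 20, 30], [True, True, False, False])
```

Groups summarized this way are then usually compared with a log-rank (Mantel-Cox) test, as named in the figure legend.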

To understand the effects of treatment on immune cell composition,
CT-26 KRAS G12C tumours were immunophenotyped. After 4 days of treatment,
AMG 510 markedly increased the infiltration of T cells, primarily
CD8+ T cells, into the tumour (Fig. 5a and Extended Data Fig. 8a).
Increased infiltration of CD8+ T cells was also observed in the combination
group, but not after anti-PD-1 monotherapy. Immunohistochemical
analysis also revealed an increased number of total and proliferating
CD3+ T cells and total CD8+ T cells after AMG 510 treatment, which were
further increased after the combination treatment (Fig. 5b, c). As an
additional comparison we used a MEK inhibitor, which blocked MAPK
signalling downstream of RAS (Extended Data Fig. 8b). This inhibitor
regressed CT-26 KRAS G12C tumours in mice to a similar level as AMG 510
(Extended Data Fig. 8c, d), but did not significantly affect the numbers
of infiltrating CD8+ T cells (Fig. 5a). AMG 510 treatment also increased
the infiltration of macrophages and dendritic cells, including CD103+
cross-presenting dendritic cells, which are critical for T cell priming and
activation and are implicated in T cell recruitment (Fig. 5a and Extended
Data Fig. 8a). PD-1 expression on CD8+ T cells was moderately increased
by both AMG 510 and the MEK inhibitor (Extended Data Fig. 8a).

Tumour RNA was purified after 2 days of treatment for transcriptional
profiling of a panel of immune-associated genes. AMG 510 induced a
pro-inflammatory microenvironment characterized by increased interferon
signalling, chemokine production, antigen processing, cytotoxic
and natural killer cell activity, as well as markers of innate immune system
stimulation, that were significantly higher compared to the effects
induced by MEK inhibition (Fig. 5d and Extended Data Fig. 8e). Infiltration
of immune cells was correlated with increased expression of several
chemokines including Cxcl11 (Extended Data Fig. 8e and Supplementary
Table 4). To examine whether these immune-enhancing effects were
directly attributable to AMG 510, CT-26 KRAS G12C cells were treated with
AMG 510 in vitro and the expression of immune genes was measured. AMG
510 induced expression of Cxcl10 and Cxcl11 (Extended Data Fig. 9a), which
are key attractants of tumour-suppressive immune cells. This provides
a potential mechanistic link by which AMG 510 treatment increases the
intratumoral concentration of chemokines, leading to the infiltration of
T cells and dendritic cells and improved immunosurveillance.
Previous data suggested that although MEK inhibition could
promote anti-tumour activity in combination with anti-PD-L1 treatment
in vivo, it can also inhibit T cell function. Using an in vitro co-culture
system with mouse bone marrow-derived dendritic cells and transgenic
CD8+ T cells, MEK inhibition impaired antigen-specific T cell proliferation,
whereas AMG 510 did not affect the T cell response (Extended
Data Fig. 9b). Furthermore, AMG 510 induced expression of MHC class I
antigens on CT-26 KRAS G12C tumour cells in vitro (Fig. 5e and Extended
Data Fig. 9c). These data suggest that AMG 510 treatment leads to
increased T cell priming, antigen recognition of tumour cells and the
potential establishment of long-term anti-tumour T cell responses.
To test this, mice that were cured by the combined treatment of AMG
510 and anti-PD-1 (Fig. 4d) were rechallenged with bilateral tumours
of CT-26 KRAS G12C and parental CT-26 (KRAS G12D) cells, or CT-26 KRAS G12C
and an unrelated mouse breast tumour model, 4T1. All 4T1 tumours
(four out of four) grew, but none of the CT-26 KRAS G12C tumours
(zero out of eight) or CT-26 parental tumours (zero out of four) became
established (Fig. 5f). In a control group of naive mice, all parental CT-26
and CT-26 KRAS G12C tumours grew (15 out of 15; Extended Data Fig. 9d).
Splenocytes collected from the cured mice were stimulated with CT-26,
CT-26 KRAS G12C or 4T1 tumour cells, and we measured secreted IFNγ as
a marker of tumour-specific T cell priming and activity. CT-26 KRAS G12C
cells and parental CT-26 cells caused nearly a threefold increase in
IFNγ, which was not induced by 4T1 cells (Fig. 5g). Together, these
data suggest that the combination of AMG 510 and anti-PD-1 therapy
prompted the establishment of long-term tumour-specific T cell
responses.


Discussion 


The discovery of the interaction with the His95 groove of KRAS(G12C)
enabled markedly increased potency and the identification of AMG
510, a first-in-class oral KRAS(G12C) inhibitor with evidence of clinical
activity in patients with KRAS G12C mutant cancer. Preclinically, AMG 510
selectively targeted KRAS G12C tumours, caused durable regression as
a monotherapy, and could be combined with cytotoxic and targeted
agents to synergistically kill tumour cells. AMG 510 treatment led to
an inflamed tumour microenvironment that was highly responsive
to immune-checkpoint inhibition. Combined treatment of anti-PD-1
therapy and a MEK inhibitor has shown preclinical efficacy in several
reports and this was associated with increased T cell infiltration.
In the present study, significantly greater immune cell infiltration was
observed after selective KRAS(G12C) inhibition compared to the MEK
inhibitor. In contrast to the reported effects of non-tumour-selective
MEK inhibition, which blocks T cell expansion and priming, selective
inhibition of KRAS(G12C) by AMG 510 resulted in increased T cell
infiltration and activation. Furthermore, the combination of AMG 510
and anti-PD-1 therapy established a memory T cell response against
both the CT-26 KRAS G12C cells and the parental CT-26 tumour cells. These
data support a model of enhanced antigen recognition and T cell memory
in which AMG 510-induced tumour cell death and innate immune
responses, combined with anti-PD-1 treatment, result in an adaptive
immune response that can recognize and eradicate related but
non-KRAS G12C tumours. There is ample evidence that the intratumoral KRAS
mutation status can be heterogeneous within the same tumour and
between primary and metastatic sites. Taken together, our data
suggest that AMG 510 might be an effective anti-tumour agent even in
settings in which KRAS G12C expression is heterogeneous.


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 
Extended Data Fig. 1 | Enhanced binding of AMG 510 to KRAS(G12C) results
in improved properties. a, X-ray co-crystal structure of KRAS(G12C/C51S/
C80L/C118S) bound to GDP and ARS-1620 (PDB: 5V9U). b, Overlay of ARS-1620
and AMG 510. The right side shows different orientations of His95 (H95)
depending on the ligand. c, Biochemical activity of AMG 510 and ARS-1620 in a
nucleotide-exchange assay with purified KRAS(G12C/C118A) or KRAS(C118A)
protein. Data are mean ± s.d., n = 4 replicates. The wild-type cysteine at position
118 was changed to alanine to avoid reactivity with non-Cys12 cysteines.
d, Biochemical activity of AMG 510 and its non-reactive propionamide analogue
in a nucleotide-exchange assay with purified KRAS(G12C/C118A); propionamide,
mean of n = 2 replicates. e, Kinetic properties of AMG 510 and ARS-1620 as
determined by mass spectrometry. f, Calculated maximal reaction rates (k_inact or
k_obs) and the concentrations that achieve a half-maximal rate (K_I or [I]50) of AMG
510 and ARS-1620. e, f, k_obs, K_I, [I]50 and standard error of the curve were
determined from nonlinear curve fitting of experimental values. g, h, Inhibition
of p-ERK after a 2-h treatment (g; mean, n = 2 replicates) and effects on cell
viability after 72-h treatment (h; mean ± s.d., n = 3 replicates) with AMG 510 or
ARS-1620.
Extended Data Fig. 3 | AMG 510 covalently modifies KRAS(G12C) in tumours
and inhibits signalling in vivo. a–d, Mice bearing MIA PaCa-2 T2 (a, c, d) or CT-
**P < 0.01 compared with vehicle; one-way ANOVA followed by Dunnett’s
multiple-comparison test. e, Half-life determination of KRAS(G12C) in MIA
PaCa-2 and NCI-H358 cells by SILAC. Data are mean ± s.d., n = 3 replicates.
f, g, AMG 510 treatment results in covalent modification of KRAS(G12C) that
inversely correlates with p-ERK inhibition in MIA PaCa-2 T2 tumours. Data are
mean ± s.d., n = 3 mice per group.
Extended Data Fig. 4 | AMG 510 inhibits tumour growth of patient-derived
xenografts, and exposure to AMG 510 at or above cellular IC90 drives
regression of xenografts. a, Mice bearing MIA PaCa-2 T2 tumours were treated
with ARS-1620 at the indicated doses. Data are mean ± s.e.m., n = 10 mice per
group; ****P < 0.0001, ***P < 0.001, *P < 0.05 compared with vehicle; repeated-
measures ANOVA followed by Dunnett’s multiple-comparison test. *P < 0.05
regression by two-sided Student’s t-test. b, c, Plasma levels of AMG 510 from MIA
PaCa-2 T2 or NCI-H358 xenografts. The dotted line represents the p-ERK IC90
values in cells after treatment with AMG 510 for 2 h. d, e, Effect of AMG 510
Extended Data Fig. 5 | Clinical activity of AMG 510 in patients with lung
cancer in a first-in-human dose-escalation study. Computed tomography
scans of two patients with KRAS p.G12C lung carcinoma who were treated with AMG
510. Additional representative pre-treatment (baseline) and post-treatment (Rx)
scans of patients described in Fig. 3 (left, 180 mg; right, 360 mg). Lesions are
lobe. Right images show, from top to bottom, the lung lower left lobe, lung lower
left lobe and adrenal gland. Lesions in the 18-week scans of the patient who was
treated with 360 mg AMG 510 were considered too small to accurately measure.
Extended Data Fig. 7 | AMG 510 inhibits KRAS(G12C) signalling and viability
of syngeneic CT-26 KRAS p.G12C cells. a, b, Cellular activity of AMG 510 and the MEK
inhibitor trametinib in CT-26 KRAS p.G12C and parental CT-26 cell lines as measured
except trametinib in CT-26, mean of n = 2 replicates) and the effects on cell
viability after 72-h treatment in spheroid culture (b; AMG 510 in CT-26 KRAS p.G12C,
mean ± s.d., n = 3 replicates; all others, mean of n = 2 replicates).
Extended Data Fig. 8 | AMG 510 treatment induces a pro-inflammatory
tumour microenvironment. a, CT-26 KRAS p.G12C tumours were
immunophenotyped by flow cytometry. Data are mean ± s.d., n = 8 mice per
group; ****P < 0.0001, *P < 0.05; NS, not significant; one-way ANOVA followed by
Tukey’s multiple-comparison test. MEKi, MEK inhibitor. b, Mice bearing CT-26
KRAS p.G12C tumours were treated orally with a single dose of vehicle (black bar) or
with the indicated dose of MEK inhibitor (blue bar). Tumours were collected 2 h
later and levels of p-ERK were measured. MEK inhibitor concentrations in plasma
(red triangle) or tumours (black open circle). Data are mean ± s.e.m., n = 3 mice
per group; ****P < 0.0001 compared with vehicle; one-way ANOVA followed by
Dunnett’s multiple-comparison test. c, Mice bearing CT-26 KRAS p.G12C tumours
were treated with MEK inhibitor at the indicated doses. Data are mean ± s.e.m.,
n = 8 mice per group; ****P < 0.0001 compared with vehicle; repeated-measures
ANOVA followed by Dunnett’s multiple-comparison test. d, Tumour volumes
from the immunophenotyping study (a) of CT-26 KRAS p.G12C tumour-bearing mice
treated over 4 days. n = 8 mice per group. e, RNA was isolated from CT-26
KRAS p.G12C tumours. n = 5 mice per group. Gene expression and scores were
calculated by nSolver v.4.0. Data are mean ± s.d.; ****P < 0.0001, ***P < 0.001,
**P < 0.01, *P < 0.05; NS, not significant; one-way ANOVA followed by Tukey’s test.
Extended Data Fig. 9 | AMG 510 induces expression of chemokines and MHC
class I antigens in CT-26 KRAS p.G12C cells. a, Quantification of Cxcl10 or Cxcl11
transcripts, as well as secreted CXCL10 (IP-10) protein, after 24-h treatment of
parental CT-26 or CT-26 KRAS p.G12C cells with AMG 510 or MEK inhibitor. Data are
mean ± s.d., n = 4 replicates; ****P < 0.0001, ***P < 0.001, **P < 0.01; NS, not
significant; two-way ANOVA followed by Tukey’s multiple-comparison test.
b, Ova-pulsed bone-marrow-derived dendritic cells and CellTrace Violet (CTV)-
labelled OT-I CD8+ T cells co-cultured with AMG 510 or MEK inhibitor. T cell
proliferation was assessed by measuring CTV dilution in T cells. Left, T cells
treated with mock (shaded), AMG 510 (solid line) or MEK inhibitor (dashed line)
from a representative experiment. Right, data from four independent
experiments were pooled and show the frequency of dividing T cells relative to
mock treatment. Data are mean ± s.e.m.; **P < 0.01; NS, not significant; one-way
ANOVA followed by Tukey’s multiple-comparison test. c, Cell surface expression
of MHC class I antigens (H-2Dd and H-2Ld) on CT-26 KRAS p.G12C cells after 24-h
treatment with AMG 510 with or without IFNγ as measured by flow cytometry.
d, Growth curves of either CT-26 or CT-26 KRAS p.G12C tumours in BALB/c mice
(n = 15).


Extended Data Table 1 | Data collection and refinement statistics for AMG 510-KRAS(G12C) complex

                            KRAS(G12C/C51S/C80L/C118S)
                            AMG 510 (6OIM)

Data collection
Space group                 P21 21 21
Cell dimensions
  a, b, c (Å)               40.87, 58.42, 65.89
  α, β, γ (°)               90, 90, 90
Resolution (Å)              30.0-1.65 (1.71-1.65)
Rsym                        0.162 (0.521)
I/σI                        6.9 (2.5)
Completeness (%)            97.0 (96.3)
Redundancy                  4.4 (4.2)

Refinement
Resolution (Å)              30.00-1.65
No. reflections             18077
Rwork / Rfree               0.1809 / 0.2152
No. atoms                   1613
  Protein                   1336
  Ligand/ion                70
  Water                     207
B-factors                   24.8
  Protein                   24.3
  Ligand/ion                24.1
  Water                     34.1
R.m.s. deviations
  Bond lengths (Å)          0.005
  Bond angles (°)           1.08

One crystal dataset was collected for the X-ray co-crystal structure of the AMG 510-KRAS(G12C) covalent complex. Values in parentheses are for the highest-resolution shell.
Statistics


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection B3 v3 (x-ray crystallography) 
EnVision Manager v1.13-1.14 (nucleotide exchange, viability, combination assays) 
Discovery Workbench v4.0 (p-ERK assays) 
ImageLab v4.1 and v5.2.1 (immunoblotting) 
Agilent RapidFire v4.0, MassHunter Workstation B.05.01 (mass spectrometry kinetic assay) 
nSolver v4.0 (NanoString) 
StudyDirector v3.1 (in vivo studies) 
BD FACSDiva v8.0.1 (flow cytometry)
Immunospot v2.6.1 (ELIspot) 
Analyst v1.6 (AMG 510-KRAS G12C conjugate detection, SILAC) 


Data analysis CCP4 Program Suite v6.4.0, HKL2000 v717, MolRep v11.2.08, Refmac5 v5.8.0073, Coot v0.7.2, PRODRG v050106.0517 (x-ray 
crystallography) 
Microsoft Excel for Office 365 (nucleotide exchange, viability, p-ERK, kinetic, ELIspot, cysteine proteomics, flow cytometry)
GraphPad Prism v7.04 (nucleotide exchange, viability, p-ERK, kinetic, in vivo efficacy/survival, flow cytometry, NanoString, ELIspot, SILAC)
MassHunter Qualitative Analysis B.07.00 (mass spectrometry kinetic assay) 
Chalice Analyzer v1.5.0.71 (combination synergy scores) 
SEQUEST (cysteine proteomics) 
nSolver v4.0 (NanoString) 
Mathematica v11.3 (SILAC) 
Immunospot v2.6.1 (ELIspot) 
BD FACSDiva v8.0.1, FlowJo software v10 (flow cytometry) 
Biostatistical Analysis R Shiny application v1.0.5 (in vivo PKPD studies) 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information 
Data


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


The majority of data generated or analyzed during this study are included in this published article or available at the source data links. X-ray crystallographic
coordinates and structure factor files have been deposited in the Protein Data Bank (PDB ID code: 6OIM). Other data that support the findings of this study are
available from the corresponding authors. Qualified researchers may request data from Amgen clinical studies. Complete details are available here:
http://www.amgen.com/datasharing.


Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size For in vivo PKPD studies, n=3/group were used. For efficacy studies, n=8-10 mice per group were used. Animal numbers for in vivo studies 
were selected using power analysis alpha 0.05 and 80% power such that a minimum change of 32-49% could be detected on the observed 
data scale. No sample size calculation was performed for in vitro studies. 


Data exclusions No data were excluded. 

Replication In vitro experiments were repeated the indicated number of times, with the exception of immunoblot experiments, which were performed
once. Synergy scores were determined from the aggregate of two 10x10 matrices for adherent monolayer combinations, but only one 6x10
matrix for spheroid combinations. In vivo PKPD dose response studies (MIA PaCa-2 T2, CT-26 KRAS p.G12C) were repeated with similar results
at least twice. The combination of AMG 510 with anti-PD-1 in CT-26 KRAS p.G12C, as well as the tumor growth measurements of untreated
CT-26 parental and CT-26 KRAS p.G12C tumors, were repeated twice with similar results. All other in vivo studies were performed once.
Independent repeats and sample sizes, as well as statistical analyses and significance levels, are also indicated in the Figure legends or in the
Statistics and Reproducibility section.


Randomization Sample randomization is not relevant to the in vitro studies presented. For in vivo studies, animals were evenly distributed such that each 
group had a similar mean and SEM at the start of the study. 


Blinding Treatment groups for the in vivo combination studies were blinded to the investigator. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems (n/a | Involved in the study)
Antibodies
Eukaryotic cell lines
Palaeontology
Animals and other organisms
Human research participants
Clinical data

Methods (n/a | Involved in the study)
ChIP-seq
Flow cytometry
MRI-based neuroimaging


Antibodies 


Antibodies used Immunoblot: all antibodies were used at 1:1,000 dilution unless otherwise indicated.
Phospho-EGF Receptor (Tyr1068) (D7A5) XP® Rabbit mAb Cell Signaling #3777; Lot 13
EGF Receptor (D38B1) XP® Rabbit mAb Cell Signaling #4267; Lot 11
Phospho-MEK1/2 (Ser217/221) Antibody Cell Signaling #9121; Lot 44
MEK1/2 Antibody Cell Signaling #9122; Lot 14
Phospho-S6 Ribosomal Protein (Ser235/236) (D57.2.2E) XP® Rabbit mAb Cell Signaling #4858; Lot 16
S6 Ribosomal Protein (5G10) Rabbit mAb Cell Signaling #2217; Lot 7
Phospho-Akt (Ser473) (D9E) XP® Rabbit mAb Cell Signaling #4060; Lot 19
Akt Antibody Cell Signaling #9272; Lot 27
Phospho-ERK1/ERK2 (Thr185, Tyr187) Polyclonal Antibody ThermoFisher #44-680G; Lot SB248818
p44/42 MAPK (Erk1/2) Antibody Cell Signaling #9102; Lot 26
Anti-Ras antibody [EPR3255] Abcam #ab108602; Lot GR117071-23
Cleaved Caspase-3 (Asp175) Antibody Cell Signaling #9661; Lot 45
Anti-β-Actin-Peroxidase Mouse mAb AC-15 Sigma #A3854; Lot 026M4820V; 1:20,000
Donkey Anti-rabbit IgG HRP GE Healthcare #NA934V; Lot 9677977; 1:5,000

Flow cytometry: all antibodies were used at 1:100 unless otherwise indicated.
PE Mouse anti-mouse H-2Dd (34-2-12, Biolegend #110608, Lot B256526)
PE Mouse anti-mouse H-2Kd (SF1-1.1, Biolegend #116608, Lot B244820)
PE Mouse anti-mouse H-2Ld/H-2Db (28-14-8, Biolegend #114507, Lot B240332)
BUV737 rat anti-mouse CD4 (RM4-5, BD Biosciences #564933, Lot 8164630)
BV421 rat anti-mouse CD8a (53-6.7, BD Biosciences #563898, Lot 7201962)
BUV737 rat anti-CD11b (M1/70, BD Biosciences #564443, Lot 7338572)
BV786 mouse anti-mouse CD45.2 (104, BD Biosciences #563686, Lot 8235903)
APC-H7 rat anti-mouse Ly-6G (1A8, BD Biosciences #565369, Lot 8121728)
BV711 hamster anti-mouse TCR β chain (H57-597, BD Biosciences #563135, Lot 7054698)
FITC rat anti-mouse CD24 (M1/69, ThermoFisher #11-0242-81, Lot 1937898)
APC hamster anti-mouse CD103 (2E7, ThermoFisher #17-1031-80, Lot 17-1031-80)
PE hamster anti-mouse CD279 (PD-1) (J43, ThermoFisher #12-9985-82, Lot 4329622)
APC/Cy7 rat anti-mouse CD90.2 (30-H12, Biolegend #105328, Lot B241601)
BV650 rat anti-mouse F4/80 (BMB8, Biolegend #123149, Lot B256505)
BV711 rat anti-mouse Ly-6C (HK1.4, Biolegend #128037, Lot B247973)
BV510 rat anti-mouse I-A/I-E (MHCII) (M5/114.15.2, Biolegend #107635, Lot B263357)
APC rat anti-mouse CD8a (53-6.7, eBioscience #17-0081-82, Lot E07056-1635)
Rat anti-mouse CD16/CD32 (2.4G2, BD Biosciences #553142, Lot 4198965)
FITC hamster anti-mouse TCR β chain (H57-597, BD Biosciences #553171, Lot 8351664)
Immunohistochemistry:
Rat anti-human CD3 (mouse CD3 cross-reactive) (CD3-12, Bio-Rad #MCA1477, Lot 7708); 1:1,000
Rabbit anti-mouse CD8a (D4W2Z, Cell Signaling Technology #98941, Lot 0712017); 1:500
Rabbit anti-human Ki67 (mouse Ki67 cross-reactive) (SP6, Sigma-Aldrich #275R-1, Lot 45305); 1:500
Rat IgG isotype negative control (Jackson Immunoresearch Labs #012-000-003, Lot 68714); 2 mcg/mL
Rabbit IgG isotype negative control (Jackson Immunoresearch Labs #011-000-003, Lot 132409); 2 mcg/mL
HRP-anti-rat-IgG (Biocare Medical #BRR4016L, Lot 100317); undiluted
HRP-anti-rabbit-IgG (Dako #K4003, Lot 10147964); undiluted


Validation All antibodies were validated by the manufacturer. Please refer to the manufacturers’ websites with the catalog information 
listed above. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) The following cell lines were purchased from American Type Culture Collection (ATCC): MIA PaCa-2, NCI-H1373, NCI-H2030, 
NCI-H2122, SW1463, SW1573, UM-UC-3, Calu-1, NCI-H1792, NCI-H23, NCI-H358, SW837, AsPC-1, A-427, LS 174T, SW480, 
A549, NCI-H1355, HCC-827, COLO-205. KM12 and NCI-H3122 were obtained from the Amgen internal cell bank, originally 
sourced from the National Cancer Institute. 
MIA PaCa-2 T2 and SW480-1AC cells were generated by passaging MIA PaCa-2 and SW480 cells, respectively, in mice. 


CT-26 KRAS p.G12C cells were generated from the murine CT-26 colorectal line (ATCC) using CRISPR technology to replace 
both KRAS p.G12D alleles with p.G12C (ThermoFisher Scientific). 


Authentication Cell lines were authenticated by short tandem repeat (STR) profiling or were used immediately after purchase from ATCC. 
Mycoplasma contamination All cell lines used for in vivo studies were confirmed to be negative for mycoplasma contamination. 


Commonly misidentified lines No commonly misidentified cell lines were used. 
(See ICLAC register) 
Animals and other organisms


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals BALB/c or athymic nude mice, all female, all 6-12 weeks of age. 


Wild animals Studies did not involve wild animals. 
Field-collected samples Studies did not involve samples collected in the field. 


Ethics oversight All animal experimental protocols were approved by the Amgen Animal Care and Use Committee and were conducted in 
accordance with the guidelines set by the Association for Assessment and Accreditation of Laboratory Animal Care. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics See clinicaltrials.gov NCT03600883.
Key inclusion criteria: age >18; documented locally-advanced or metastatic KRASG12C; measurable or evaluable disease; ECOG 
<2; life expectancy >3 months (mo). Key exclusion criteria: active brain metastases; myocardial infarction within 6 mo. 
Recruitment Patients were recruited at clinical study sites based on the presence of the KRAS p.G12C mutation in their tumor by standard
genotype testing.


Ethics oversight Clinical trial NCT03600883 was conducted in compliance with all relevant ethical regulations. The protocol was approved by the
institutional review boards (IRB)/independent ethics committees (IEC) of all clinical study sites.


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Clinical data 


Policy information about clinical studies 
All manuscripts should comply with the ICMJE guidelines for publication of clinical research and a completed CONSORT checklist must be included with all submissions. 


Clinical trial registration NCT03600883 
Study protocol Study is ongoing; clinical trial information is available on clinicaltrials.gov 
Data collection This multicenter, open-label, first-in-human, phase 1 study (NCT03600883) evaluates safety, tolerability, PK/pharmacodynamics
(PK/PD), and efficacy of AMG 510 in patients (pts) with KRASG12C advanced solid tumors. Primary endpoint: safety [e.g., adverse
events (AEs); dose limiting toxicities (DLT)]; key secondary endpoints: PK, ORR (overall response rate) [assessed every 6 weeks
(wks)] and PFS (progression free survival). Key inclusion criteria: age >18; documented locally-advanced or metastatic KRASG12C;
measurable or evaluable disease; ECOG <2; life expectancy >3 months (mo). Key exclusion criteria: active brain metastases;
myocardial infarction within 6 mo. Sequential dose escalation cohorts are enrolled to evaluate safety, tolerability, PK/PD and to
find the maximum tolerated dose (MTD). After identifying the MTD, 60 pts with advanced KRASG12C STs can enroll. Daily oral
AMG 510 is given until disease progression (PD), intolerance, or consent withdrawal.

Clinical data presented in this manuscript were collected at participating clinical sites from September 2018 through June 2019.


Outcomes The endpoints described in this manuscript were based on RECIST 1.1 criteria for clinical responses. 


Flow Cytometry 
Plots 


Confirm that: 


The axis labels state the marker and fluorochrome used (e.g. CD4-FITC). 


The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers). 


All plots are contour plots with outliers or pseudocolor plots. 


A numerical value for number of cells or percentage (with statistics) is provided. 


Methodology 


Sample preparation For in vitro studies, cells were non-enzymatically detached from the wells, washed with staining buffer (PBS/0.5% BSA), and then 
incubated with PE-conjugated H-2Dd, H-2Kd, or H-2Ld antibodies (BioLegend) for 30 minutes on ice. After washing, cells were 
resuspended in staining buffer containing SYTOX Blue Dead Cell Stain (Life Technologies), and then analyzed by flow cytometry. 
For co-cultures, CellTrace Violet-labeled CD8+ T cells from spleens of OT-I transgenic mice were combined with bone marrow-
derived dendritic cells (BMDCs) in 96-well plates with or without further AMG 510 or MEKi treatment. Co-cultures were
incubated for three days at 37°C. Cell division was assessed by flow cytometry by measuring CTV dilution in TCRβ+ CD8a+ cells.

For in vivo studies, tumors were harvested, weighed, minced, and placed in Liberase TL (0.2 mg/ml; Roche) and DNase I (20 μg/
ml; Ambion). Tumor cell suspensions were then homogenized using a gentleMACS Dissociator (Miltenyi Biotech) and incubated
at 37°C for 15 minutes on a MACSmix Tube Rotator (Miltenyi Biotech). Cells were then treated with 0.02% EDTA (Sigma) and
heat-inactivated FBS (ThermoFisher Scientific) and filtered to remove clumps. After centrifugation, the cell pellets were
resuspended in LIVE/DEAD Fixable Blue Dead Cell Stain (ThermoFisher Scientific) for 30 minutes. Cell surface staining was
performed with the indicated antibodies (see Antibodies section above) before fixation and permeabilization of the cells
(Intracellular Fixation & Permeabilization Buffer Set, eBiosciences) for intracellular staining. CountBright™ Absolute Counting
Beads (ThermoFisher Scientific) were added to each sample before analysis on an LSR II flow cytometer (BD Biosciences). All
analyses were performed with FlowJo software v10 (FlowJo). Absolute cell counts were determined by normalizing cell numbers
to beads recorded, divided by the volume of tumor aliquot analyzed and the mass of the tumor.
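
The bead-based normalization described above can be sketched as a short calculation. This is only one plausible reading of the sentence, not the authors' script: the function name, the argument names, the units, and the use of the number of beads added per sample (`beads_added`, which the text does not state) are all assumptions made for illustration.

```python
def absolute_cell_count(cell_events, bead_events, beads_added,
                        aliquot_volume_ml, tumor_mass_mg):
    """Estimate cells per ml of aliquot per mg of tumor (hypothetical units).

    Assumes the recorded bead events are a known fraction of the beads
    added to the sample, so the same fraction of cells was recorded.
    """
    if bead_events <= 0 or aliquot_volume_ml <= 0 or tumor_mass_mg <= 0:
        raise ValueError("bead events, volume and mass must be positive")
    # Scale recorded cell events up to the whole stained aliquot.
    cells_in_aliquot = cell_events / bead_events * beads_added
    # Normalize by the aliquot volume analyzed and by the tumor mass.
    return cells_in_aliquot / (aliquot_volume_ml * tumor_mass_mg)


# Example with made-up numbers: 5,000 gated cell events, 1,000 bead events,
# 50,000 beads added, 0.5 ml aliquot, 200 mg tumor.
example = absolute_cell_count(5000, 1000, 50000, 0.5, 200)
```

Raising an error on non-positive denominators is a design choice of this sketch; a sample with no recorded beads cannot be normalized at all, so failing loudly seems safer than returning zero.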


Instrument In vitro samples were run on a BD LSRFortessa. In vivo samples were run on a LSR II flow cytometer (BD Biosciences). 
Software All analyses were performed with either BD FACSDiva or FlowJo software v10. 


Cell population abundance


Gating strategy For in vitro experiments, FSC-H/FSC-A gate was used to identify single cells and eliminate doublets from the analysis. FSC/SSC
gate (P1) was used to gate on the population of CT-26 KRAS p.G12C cells. Cells from P1 were displayed on a histogram and
SYTOX Blue negative cells were gated on to identify live cells. The mean fluorescent intensity (MFI) of MHC class I antigen
expression was measured on these live cells.
For co-culture experiments, lymphocytes were first gated using FSC/SSC. Live cells were then gated using 7AAD viability dye,
followed by exclusion of doublets using SSC-A/SSC-H. CD8+ T cells were then gated using fluorescently labeled antibodies.
Finally, CellTrace Violet dye incorporation was assessed on the CD8+ T cells.

For in vivo experiments, cells were first gated on intact cells using FSC/SSC. Cells were then gated on live cells using the viability
dye, followed by cell type-specific gating using fluorescently labeled antibodies.


Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information. 
The human gastrointestinal tract consists of a dense and diverse microbial
community, the composition of which is intimately linked to health. Extrinsic
factors such as diet and host immunity are insufficient to explain the constituents of
this community, and direct interactions between co-resident microorganisms have
been implicated as important drivers of microbiome composition. The genomes of
bacteria derived from the gut microbiome contain several pathways that mediate
contact-dependent interbacterial antagonism. Many members of the Gram-negative
order Bacteroidales encode the type VI secretion system (T6SS), which
facilitates the delivery of toxic effector proteins into adjacent cells. Here we report
the occurrence of acquired interbacterial defence (AID) gene clusters in
Bacteroidales species that reside within the human gut microbiome. These clusters
encode arrays of immunity genes that protect against T6SS-mediated intra- and
inter-species bacterial antagonism. Moreover, the clusters reside on mobile
elements, and we show that their transfer is sufficient to confer resistance to toxins
in vitro and in gnotobiotic mice. Finally, we identify and validate the protective
capability of a recombinase-associated AID subtype (rAID-1) that is present broadly
in Bacteroidales genomes. These rAID-1 gene clusters have a structure suggestive of
active gene acquisition and include predicted immunity factors of toxins derived
from diverse organisms. Our data suggest that neutralization of contact-dependent
interbacterial antagonism by AID systems helps to shape human gut microbiome
ecology.


Polymicrobial environments contain a plethora of biotic and abiotic
threats to their inhabitants. Bacterial survival in these settings
necessitates elaborate defensive mechanisms. Some of these are basal and
protect against a wide range of threats, whereas others, such as
CRISPR-Cas, represent adaptations that are unique to the specific threats
encountered by a bacterial lineage. In densely colonized habitats such
as the mammalian gut, overcoming contact-dependent interbacterial
antagonism is a major hurdle to survival. The T6SS is a pathway widely
used by gut bacteria to mediate the delivery of toxic effector proteins
into neighbouring cells. Although kin cells are innately resistant to
these effectors via cognate immunity proteins, whether non-self cells
in the gut can escape intoxication is unknown. Notably, several recent
studies have reported that bacteria from a range of environments can
encode T6S immunity genes that lack an accompanying cognate effector
gene. It has been suggested that these genes are involved in
interbacterial defence, but they have yet to be functionally investigated
in a physiological context.


B. fragilis T6S immunity in the human gut microbiome 


To identify potential mechanisms of defence against T6S-delivered
interbacterial effectors in the human gut, we mined a large collection
of shotgun metagenomic samples (n = 553) from several studies
of the human gut microbiome for sequences homologous to known
immunity genes (Supplementary Table 1). We first focused our
efforts on Bacteroides fragilis, which has a well-described and diverse
repertoire of effector and cognate immunity genes (Supplementary
Table 2). As expected for genes predicted to reside within the
B. fragilis genome, sequences mapping to these immunity loci were
detected at a similar abundance to that of B. fragilis species-specific
marker genes in many microbiome samples (Fig. 1a, grey; Supplementary
Table 1). However, in a second subset of samples, immunity
genes were detected at a significantly higher (more than ten times)
abundance than expected given the abundance of B. fragilis (Fig. 1a,
blue). Finally, we identified a third subset of samples in which the
sequences of immunity genes were detected in the absence of
B. fragilis (Fig. 1a, green). These latter sequences include close
homologues of 12 out of 14 unique immunity genes (i1-i14) encoded by
B. fragilis (Extended Data Fig. 1a). In contrast to the pattern observed
for immunity genes, we found no samples in which the abundance
of corresponding cognate effector genes markedly exceeded that of
B. fragilis (Fig. 1b, Supplementary Table 1).

Fig. 1 | T6SS orphan immunity genes are found in human gut microbiomes.
a, b, Comparison of abundance of B. fragilis-specific T6SS immunity genes (a)
or effector genes (b) with B. fragilis marker genes in adult microbiome samples
derived from the Human Microbiome Project (HMP) and METAgenomics of the
Human Intestinal Tract (MetaHIT) studies (Supplementary Table 1). Abundance
values denote the number of reads mapped to the gene, normalized by gene
length and total number of reads in the sample. Abundances are calculated as
the average abundance of all immunity, effector or B. fragilis species-specific
marker genes. Samples with undetectable B. fragilis (green) and samples in
which immunity gene abundance exceeds that of Bacteroides by over tenfold
(blue) are highlighted. ND, not detected.
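
The abundance measure and the three sample subsets described for Fig. 1 can be illustrated with a minimal sketch. It assumes abundance is simply mapped reads divided by gene length and by total reads, with no further scaling constant; the function names, the subset labels and the exact handling of the tenfold cut-off are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import mean


def gene_abundance(mapped_reads, gene_length_bp, total_reads):
    """Reads mapped to a gene, normalized by gene length and sample depth."""
    return mapped_reads / (gene_length_bp * total_reads)


def mean_abundance(genes, total_reads):
    """Average abundance over a gene set given as (mapped_reads, length) pairs."""
    return mean(gene_abundance(reads, length, total_reads)
                for reads, length in genes)


def classify_sample(immunity_abundance, marker_abundance, fold=10.0):
    """Assign a sample to one of the subsets described in the text.

    The labels are illustrative names chosen for this sketch.
    """
    if marker_abundance == 0:
        # B. fragilis marker genes were not detected in this sample.
        if immunity_abundance > 0:
            return "immunity_without_b_fragilis"  # green in Fig. 1a
        return "neither_detected"
    if immunity_abundance > fold * marker_abundance:
        return "immunity_in_excess"  # blue in Fig. 1a
    return "concordant"  # grey in Fig. 1a


# Example with made-up counts for one sample of 1,000,000 reads.
immunity = mean_abundance([(400, 1000), (600, 1500)], 1_000_000)
marker = mean_abundance([(20, 1000), (40, 2000)], 1_000_000)
label = classify_sample(immunity, marker)
```

In this made-up example the immunity genes are twenty times more abundant than the marker genes, so the sample falls in the excess subset.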


Orphan immunity genes are encoded by many species 


The detection of B. fragilis immunity gene homologues in samples in 
which we were unable to detect B. fragilis strongly suggests that these 
elements are encoded by other bacteria in the gut. Indeed, BLAST 
searches revealed the genomes of several Bacteroides spp. that con- 
tain B. fragilis T6S immunity gene homologues, including B. ovatus 
(i6, i7, 5 and i14), B. vulgatus (i8 and i13), B. helcogenes (i1, i9 andi10) and 
B.coprocola (i8 andi13). To determine whether these bacteria could also 
account for the presence of immunity genes in the human gut micro- 
biome, we assembled full-length predicted immunity genes from the 
metagenomic sequencing reads of individual microbiomes. We limited 
this assembly to homologues of i6—the most prevalent immunity gene 
detected in samples lacking B. fragilis (Extended Data Fig. 1a). Clustering 
of the recovered homologues showed that most i6 sequences distribute 
into three discrete clades that differ by several nucleotide substitutions 
(i6:cI, cII and cIII) (Fig. 2a and Supplementary Table 3). A comparison
of these immunity sequences to available bacterial genomes revealed 
a clade matching cognate immunity genes in B. fragilis (i6:cI). In addition,
we found an i6 sequence homologous to i6:cIII in the genome of
B. ovatus, which we previously found does not contain the cognate T6SS 
effector gene’®. 

To identify the species that encode these sequences in human gut 
microbiomes, we used a simplified linear model to identify Bacteroides 
spp., the abundance of which in microbiome samples best fits that of 
each immunity sequence clade. We found that i6:cIII is best explained in
gut metagenomes by B. ovatus (Fig. 2b, c), which suggests that although 
reference genomes of both B. fragilis and B. ovatus contain these 
sequences, it is most often contained by the latter in natural populations.
We could not confidently define a single species containing i6:cII
by this method (Fig. 2d, e); therefore, we applied the same analysis pipe- 
line to an infant microbiome dataset for which matching stool samples 
were available’® (Extended Data Fig. 1b and Supplementary Table 4). 
Whole-genome sequencing of isolates identified B. xylanisolvens as 
a bacterium containing i6:cII in these samples. Notably, this species
best fit the abundance of i6:cII genes in the large adult metagenomic
Fig. 2 | T6SS orphan immunity gene clusters are encoded by several species
in the human gut microbiome. a, Dendrogram depicting hierarchical
clustering of orphan immunity gene i6 sequences extracted from genomes
(n = 15) and metagenomes (n = 32) derived from the indicated HMP (SRS) or
MetaHIT (MH) samples. Sequence clades are denoted cI-III. Asterisks indicate
strains shown in f and g. b, d, Comparison of abundance of genes from the
indicated immunity clades (colours as in a) with marker genes from B. ovatus (b)
or B. xylanisolvens (d) in adult microbiome samples (Supplementary Table 1).
c, e, Linear model error values for the six species best fitting i6:cIII (c) and i6:cII
(e) gene abundances calculated as in Fig. 1. f, g, Representative AID-1 (f) and
AID-2 (g) gene clusters containing homologues of the indicated B. fragilis T6S
immunity genes. The i6 gene of SRS056259 did not meet our sequence depth
coverage requirements for inclusion in i6:cIII. The B. xylanisolvens strain in g was
sequenced as part of this study (BioProject PRJNA484981). A region of
difference between B. ovatus 3725 and B. fragilis 638R clusters is highlighted.
All strain abbreviations are defined in Supplementary Table 3.
datasets, albeit by a narrow margin (Fig. 2d). On the basis of these observations,
we hypothesized that orphan immunity genes, encoded by B. fragilis
and other species of Bacteroides, have an adaptive role in the
gut by providing defence during intra- and inter-species antagonism. 


AID system immunity genes neutralize T6S toxins 

To gain insight into the function of orphan immunity genes, we 
examined their genomic context in available reference genomes and 
assembled sequence scaffolds from metagenomic data. We found that 
homologues of B. fragilis T6S immunity genes i6 and i7 are located 
together within discrete gene clusters, which we termed AID-1 (acquired 
interbacterial defence 1) systems, in several Bacteroides strains and in 
microbiome samples (Fig. 2f). Within the AID-1 gene cluster, we identi- 
fied distant homologues and pseudogenized remnants of additional 
B. fragilis immunity genes, including i4, i5, i11 and i14 (Fig. 2f and Supplementary
Table 5). These findings prompted us to search for more
distant homologues of B. fragilis T6S immunity genes. This revealed 
Fig. 3 | Orphan immunity genes are mobile and protect against T6S-delivered
toxins. a, b, Outcomes of growth competition assays between the
indicated strains containing AID-1 (a) or AID-2 (b) versus B. fragilis 9343.
Relative recipient fitness was determined by calculating the ratio of final to
initial colony-forming units (c.f.u.) and normalizing to corresponding
experiments with T6S-inactive B. fragilis 9343 (ΔtssC). Data are mean ± s.d. of
three technical replicates indicative of at least three biological replicates.
*P < 0.05, **P < 0.01, unpaired two-tailed t-test. c, Outcome of pairwise
competition between the indicated B. fragilis strains in germ-free mice. Mice
were colonized with B. fragilis 9343 for one week and challenged with 638R
(n = 8 mice per group for each of two


gene clusters containing orphan homologues of i1, i3, i8, i9, i10 and i13
in diverse Bacteroides genomes and we also identified distant homo- 
logues of these genes in a metagenomic gene catalogue (Extended 
Data Fig. 2a, b). In B. xylanisolvens, we found that genes belonging 
to i6:cII are located in a unique, but analogous context adjacent to a
homologue of i5 on an apparent transposable element” (Fig. 2g). We 
designated this sequence AID-2. 

We next defined the phenotypic implications of orphan immunity 
genes of Bacteroides spp. during interbacterial competition. B. fragilis 
9343 encodes the cognate effectors for i6 and i7, and previous data demonstrate
that the corresponding toxins antagonize assorted Bacteroides
spp. in vitro and in gnotobiotic mice’. We thus used this strain in growth 
competition assays against Bacteroides spp. bearing orphan immunity 
genes, derivative strains containing deletions of these genes, or geneti- 
cally complemented strains. These experiments showed that in both 
B. ovatus and B. fragilis, AID-1 system genes grant immunity against cor- 
responding T6S effectors (Fig. 3a, Extended Data Fig. 3a, b). The i6 and 
i7 genes of B. ovatus did not influence the outcome of its competition 
with B. fragilis 638R, which possesses an orthogonal effector repertoire
(Extended Data Fig. 3c). Finally, we also found that an i6:cII gene
from a B. xylanisolvens AID-2 system gives this bacterium protection
against e6 of B. fragilis 9343 (Fig. 3b). Together, these data show that 
the orphan immunity genes of several Bacteroides species—localized 
to AID systems—can confer protection against effectors delivered by 
the T6SS of B. fragilis. 

B. fragilis is typically found as a clonal population in the human gut 
microbiome, and recent studies suggest that this is in part owing to 
independent experiments). For box plots, the middle line denotes the mean for
each group; the box denotes the interquartile range; and the whiskers denote
the minimum and maximum values. **P < 0.01, Mann-Whitney U-test for each
time point. d, Schematic depicting a B. fragilis 638R ICE containing the AID-1
cluster depicted in Fig. 2f. e, PCR analysis of an AID-1 transfer experiment.
Schematic with primer locations provided in Extended Data Fig. 3f. Transfer data
are representative of two independent biological replicates. f, Outcomes of
growth competition assays between B. fragilis 43859 or a derivative AID-1 ICE
recipient and B. fragilis 9343. Data presentation and statistics are as in a and b.
g, Results of pairwise competition between the indicated B. fragilis strains in
germ-free mice (n = 6 mice from two independent experiments). Statistics and


active strain exclusion via the T6SS*"®. However, in colonization experi- 
ments in gnotobiotic mice, certain B. fragilis strain pairs inexplicably co- 
exist!°. We noted that one such pair corresponds to B. fragilis 9343 and 
B. fragilis 638R, the latter of which contains an AID-1 system containing 
homologues of i6 and i7. To determine whether our in vitro results with 
these strains extend to a more physiological setting, we measured the
fitness contribution of the orphan immunity genes encoded by B. fra- 
gilis 638R after pre-colonization of germ-free mice with B. fragilis 9343 
(Fig. 3c, Extended Data Fig. 3d, e). Our results indicated that the cumulative
protection afforded by orphan i6 and i7 genes underlies the ability of
B. fragilis 638R to persist during T6S-mediated antagonism in vivo. 


AID-1 transfer confers protection against T6S 


Notably, we found that AID-1 resides on a predicted mobile integrative
and conjugative element (ICE), which provides a possible explana- 
tion for its distribution” (Fig. 3d). To test whether this element can 
be transferred between strains, we performed mobilization studies 
using B. fragilis 638R as a donor and B. fragilis 43859 as a recipient. An
antibiotic-resistance marker was inserted within AID-1 to facilitate the 
detection of its transfer. With this tool, we readily detected AID-1 transfer
(Fig. 3e). This occurred at a frequency of approximately 5 x 10°, in
line with previous quantification of ICE mobility in Bacteroides spp.”°. 
Next, we asked whether the transfer of AID-1 to B. fragilis 43859 is suffi- 
cient to confer resistance to T6S-mediated antagonism. In vitro growth 
competition assays against B. fragilis 9343 showed that AID-1 effectively 
neutralizes intoxication by e6 and e7 (Fig. 3f). The receipt of AID-1 also


granted notable protection to B. fragilis 43859 against T6S-mediated 
killing in germ-free mice pre-colonized with B. fragilis 9343 (Fig. 3g). 
Together, these findings indicate that the transfer of a mobile orphan 
immunity island to a naive Bacteroides strain is sufficient to provide 
defence against T6S effectors. 

Deciphering the contribution of individual gene products, or even 
whole pathways, to bacterial fitness in complex microbial communities 
is challenging. We reasoned that the identification of orphan immunity 
genes in human gut metagenomes, coupled with our ability to infer 
their organismal source, provided an opportunity to measure the effect 
of these defensive factors on competitiveness in the gut. To this end, 
we compared the abundance of B. ovatus strains with and without i6 
and i7 orphan immunity genes in human gut metagenome samples. 
We found that the average abundance of B. ovatus strains with orphan 
immunity genes significantly exceeds that of those without orphan 
genes (Extended Data Fig. 3g). One interpretation of this finding is that 
the acquisition of i6 and i7 allows B. ovatus to increase its niche; however,
there are several potential caveats inherent to these correlative data 
that cannot be ruled out. For example, B. ovatus strains that contain i6
and i7 might be related and enriched for other fitness determinants that
account for their abundance. 


Recombinase-associated AID systems are prevalent 


Given the benefit of orphan immunity genes against B. fragilis effectors, 
we hypothesized that this mechanism of inhibiting interbacterial antagonism
should extend to effectors produced by other species. We previously
found that B. fragilis is antagonized by other Bacteroides spp. in the
human gut microbiome’®. In addition to the T6SS present exclusively in 
B. fragilis, this species and other Bacteroides species can possess other 
T6SSs, referred to as GA1 (genetic architecture 1) and GA2, with a distinct
and non-overlapping repertoire of effector and immunity genes’. The 
effectors of these systems exhibit hallmarks of antibacterial toxins and 
we demonstrated their ability to mediate interbacterial antagonism’*® 
(Fig. 4, Extended Data Fig. 4). Therefore, we searched B. fragilis genomes 
for sequences that are homologous to the immunity genes correspond- 
ing to these systems. In 29 out of the 122 available B. fragilis genomes, 
we identified apparent orphan homologues of these immunity genes 
grouped within gene clusters (Fig. 4a). Although analogous to the AID-1 
and AID-2 systems, these clusters have several unique characteristics 
including skewed GC content, conservation of a gene encoding a pre- 
dicted XerD-family tyrosine recombinase, and repetitive intergenic 
sequences” (Extended Data Fig. 5a-c). These often-large gene clusters,
hereafter referred to as recombinase-associated AID (rAID-1) systems, 
can exceed 16 kb and contain up to 31 genes with varying degrees of 
homology to T6S immunity genes and predicted immunity genes associated
with other interbacterial antagonism pathways? (Fig. 4a, Extended
Data Fig. 5d, e, Supplementary Table 6). Using the shared characteristics 
of B. fragilis rAID-1 systems, we searched for related gene clusters across 
sequenced Bacteroidales genomes. We found that more than half of 
sequenced bacteria belonging to this order possess a rAID-1 system 
(226 out of 423) (Fig. 4b, Supplementary Tables 6, 7). In summary, these
gene clusters contain 579 unique genes, which encompass homologues 
of 25 Bacteroidales T6S immunity genes. 

The prevalence of rAID-1 genes in Bacteroidales genomes suggested 
that these elements may be common in the human gut microbiome. 
To investigate this, we searched metagenomic data for sequences that 
map to Bacteroidales T6S orphan immunity genes found within rAID-1 
systems. Notably, we found one or more rAID-1 immunity genes in 551
out of 553 samples using a 97% sequence identity threshold to map 
reads (Supplementary Table 8). These rAID-1 immunity genes diverge 
considerably from corresponding cognate immunity genes (corre- 
sponding to 32-91% amino acid identity), which suggests that the latter 
are an unlikely source of significant false positives in this analysis. We 
also searched the same samples for rAID-1-associated recombinase 
Fig. 4 | rAID systems encode toxin-neutralizing immunity genes and are
prevalent in human gut microbiomes. a, b, rAID-1 clusters from the indicated
B. fragilis (a) or Bacteroidales (b) species. rAID-1 genes were assigned to
functional immunity classes (indicated by gene colouring) via profile HMM
scans and BLAST against a curated database of Bacteroidales T6SS immunity
genes?*. Coloured circles indicate taxonomic association of the top non-rAID-1
homologue. Homology (70% amino acid identity) between gene 08050 of the
B. fragilis 9343 rAID-1 cluster and a T6S cognate immunity gene from B. fragilis
YCH46 (2851) is indicated. c, Viable E. coli cells recovered from cultures
expressing the indicated proteins (see Supplementary Table 10 for locus tags).
Data are mean ± s.d. of three technical replicates representative of three
independent biological replicates. d, Outcomes of growth competition assays
between the indicated Bacteroides strains (n = 3 biologically independent
samples). The relevant rAID-1 gene of B. fragilis 9343 and its corresponding
effector within B. fragilis YCH46 are depicted in a. **P < 0.01, unpaired two-tailed
t-test. Data are mean ± s.d.


sequences. Although recombinase genes are widely distributed across 
bacteria, close homologues (more than 50% amino acid identity) of 
those found associated with rAID-1 systems are restricted to this con- 
text and only found in Bacteroidales genomes. Consistent with this, 
we found that the abundance of rAID-1 recombinase genes correlates 
strongly with the genus Bacteroides (Extended Data Fig. 5f). 


Divergent rAID-1 immunity genes protect against T6S 

Orphan immunity genes encoded within rAID-1 clusters diverge more 
from cognate immunity than do those within AID systems. Thus, we
sought to experimentally validate the ability of rAID-1 immunity genes 
to protect bacteria from intoxication. Because most bacteria contain- 
ing rAID-1 systems have limited genetic tools, we used Escherichia coli 
to identify three Bacteroidales T6S effector genes that intoxicate cells 
in a manner that is neutralized by cognate immunity. In each case, we
found that co-expression of these effector genes with corresponding 
rAID-1-associated orphan immunity genes, but not mismatched orphan 
immunity genes, restored E. coli growth (Fig. 4c). Both of the genes from 
one effector-orphan immunity pair that we validated derive from genetically
tractable strains: B. fragilis YCH46 (effector, 2850) and B. fragilis
9343 (orphan immunity, 08050). In vitro growth competition experi- 
ments with these strains, and mutant and genetically complemented 
derivatives, showed that an endogenous rAID-1 orphan immunity gene 
of B. fragilis 9343 can neutralize a T6S-delivered toxin (Fig. 4d). 

The orphan immunity systems that we defined consist of many genes
and their expression could incur a substantial metabolic burden. As a
first step towards understanding the regulation of AID systems, we per- 
formed quantitative PCR with reverse transcription (qRT-PCR) analysis 
to compare the expression of the systems in the presence and absence 
of a competitor strain. These studies provided evidence that transcription
of both systems is induced by co-cultivation with a competitor
strain (Extended Data Fig. 5g). We also examined meta-transcriptomic 
data for evidence of AID expression. Owing to a paucity of such data 
available for samples definitively containing AID-1 and AID-2, we could 
not systematically quantify the expression of these systems. However, 
using conservative criteria for defining rAID-1-associated genes in meta- 
transcriptomic data, we found evidence for the expression of this system 
in every sample derived from a large study (n = 156)” (Supplementary
Table 9). In some samples, such as those with high levels of Bacteroides,
rAID-1 genes accounted for nearly 1 in 10,000 of all meta-transcriptomic
reads. Together with our functional characterization of AID systems, 
these findings suggest that acquisition and maintenance of consolidated
orphan immunity determinants is a common mechanism by which
Bacteroidales defend against interbacterial antagonism in the human 
gut microbiome. 


Discussion 


Mounting evidence suggests that competitive interactions between 
bacteria predominate in many environments”. This evolutionary 
pressure has undoubtedly led to the wide dissemination of idiosyn- 
cratically orphaned immunity genes that are predicted to provide 
resistance to diverse antagonistic pathways”"***°, Modelling stud- 
ies predict that interbacterial antagonism is a crucial contributor 
to the maintenance of a stable gut community”. Our findings reveal 
that a corollary of the pervasiveness of antagonistic mechanisms 
is strong selective pressure for genes that can provide protection 
against attack, establishing a molecular arms race that has led to the 
diversification and expansion of T6S effectors. Deciphering the link 
between orphan immunity genes and the bacteria harbouring the 
cognate effectors may help to shed light on the physical connectivity 
of bacteria in the gut microbiome. 

It is now appreciated that phage defence mechanisms, including the
adaptive system CRISPR, are crucial for bacteria to cope with the omni- 
present threat and deleterious outcome of phage infection’. However, 
the ubiquity of interbacterial antagonistic systems suggests that in most 
habitats, bacteria are equally, or perhaps more likely to be subject to 
attack and potential cell death via the action of other bacteria”. Our 
characterization of AID systems encoded by prevalent members of the 
human gut microbiota seems to reconcile these observations and dem- 
onstrate that the neutralization of contact-dependent interbacterial 
antagonism can be a critical mechanism for survival in polymicrobial
environments. In addition, it suggests that analogous to the immune 
system of vertebrates, that of bacteria includes arms specialized in 
viral or bacterial defence. 
Methods


Microbiome data 

Metagenomic data from healthy adults were obtained from several large- 
scale sequencing projects. We specifically obtained 147 samples from 
the Human Microbiome Project (HMP) 1.0, 100 samples from HMP 1.2, 
and 99 and 207 samples from two different MetaHIT datasets*>”””5, We 
further obtained paired metagenomic-metatranscriptomic data from 
a study of 156 individuals”. Finally, we obtained a database of genes 
identified from 1,267 assembled metagenomes as part of the integrated 
gene catalogue (IGC)””. 


Analysis of gene and species abundances in microbiome samples 
We previously compiled a list of T6SS immunity and effector genes®. We
also compiled a list of species-specific marker genes for all Bacteroides
species obtained from MetaPhlAn 2.0”. To determine the abundance of
a given immunity, effector or marker gene in each metagenomic sample,
single-end metagenomic reads were aligned to gene sequences using
bowtie2, allowing for one mismatch in the seed™. We counted the number
of reads that aligned to each such gene with at least 80% nucleotide
identity (to encompass divergent orphan immunity gene sequences) 
and minimum mapping quality of 20. The abundance of a gene was 
calculated as the number of reads aligned to this gene, normalized by 
the gene length and by the library size. For each species, the average 
gene level abundance of all species-specific marker genes was used to 
assess the species abundance. For the total Bacteroides abundance, we 
used the sum of all species-specific marker genes in the genus. Samples 
were only included in an analysis if they had at least 10 reads mapping to 
the T6SS genes in question (effectors, immunity or recombinases). On 
the basis of the abundance of GA3 immunity genes and B. fragilis, we 
split samples into those in which B. fragilis was not detectable, those 
in which the immunity gene had more than ten times the abundance 
of the B. fragilis marker gene, and those in which such discrepancy 
between the abundance of immunity genes and that of B. fragilis was 
not observed. Meta-transcriptomics data were processed similarly to 
metagenomics data, except that abundance values were converted to a
reads per kilobase of transcript per million mapped reads (RPKM) value 
for familiarity with canonical RNA sequencing analysis. 
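
The normalization and sample grouping described above can be illustrated with a minimal Python sketch. The names GeneCounts, abundance, rpkm, mean_abundance and classify_sample are hypothetical and introduced only for illustration; the read counts are assumed to have already passed the identity and mapping-quality filters, and the authors' actual pipeline is not shown in the text.

```python
from dataclasses import dataclass


@dataclass
class GeneCounts:
    reads: int       # reads aligned to the gene after filtering (assumed input)
    length_bp: int   # gene length in base pairs


def abundance(gene: GeneCounts, library_size: int) -> float:
    """Reads mapped to the gene, normalized by gene length and library size."""
    return gene.reads / gene.length_bp / library_size


def rpkm(gene: GeneCounts, mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return gene.reads / (gene.length_bp / 1_000) / (mapped_reads / 1_000_000)


def mean_abundance(genes: list, library_size: int) -> float:
    """Average abundance over a gene set (for example, all marker genes of a species)."""
    return sum(abundance(g, library_size) for g in genes) / len(genes)


def classify_sample(immunity: float, fragilis: float, fold: float = 10.0) -> str:
    """Group a sample by comparing immunity gene and B. fragilis abundance.

    The labels are illustrative; the tenfold cut-off follows the text.
    """
    if fragilis == 0:
        return "B. fragilis not detected"
    if immunity > fold * fragilis:
        return "immunity in excess"
    return "no discrepancy"
```

In this sketch a sample with no detectable B. fragilis marker reads is assigned to the first group regardless of its immunity gene abundance, which mirrors the three-way split described in the text.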


Orphan immunity phylogenetic analysis 

Filtered reads derived from human shotgun microbiome datasets 
were aligned using bowtie2 as described above and subsequently 
converted to a pileup using samtools with parameters --excl-flags
UNMAP,QCFAIL,DUP -A -q0 -C0 -B*”*. A sequence corresponding to
the most abundant version of the immunity gene in the sample was 
reconstructed from that pileup as follows. First, 50 bases from the start 
and end were trimmed owing to a propensity for low coverage. Second, at
all sites with at least 10 times coverage the base was set to the major allele. 
Sites with less than ten times coverage were assigned an ambiguous 
base. Finally, we only kept the reconstructed sequence in metagenomic 
samples where at least 90% of the sequence had more than ten times 
coverage. The number of single nucleotide polymorphisms between 
all immunity sequences, both from metagenomic samples and from 
Bacteroides genomes, was calculated and used to populate a distance 
matrix. Because obtained distances were small (for example, a single 
base difference), we used hierarchical clustering (with complete linkage), 
rather than standard phylogenetic reconstruction methods, to visualize 
the relatedness between different sequences. Sequence clades defined 
by hierarchical clustering are denoted (cI-III), as discussed in the text.
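
The reconstruction and clustering steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the pileup is represented as a list of per-site base counts rather than parsed samtools output, ambiguous sites are ignored when counting single nucleotide differences, and the tree is cut at a distance threshold (max_dist) to yield flat clusters, whereas the text uses the clustering for visualization. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def consensus(site_counts, trim=50, min_cov=10, min_fraction=0.9):
    """Major-allele consensus from per-site base counts.

    site_counts: list of Counter objects (base -> read count), one per position.
    Returns None when too little of the trimmed sequence is well covered.
    """
    core = site_counts[trim:len(site_counts) - trim]
    if not core:
        return None
    bases = []
    for counts in core:
        depth = sum(counts.values())
        # Well-covered sites take the major allele; others become ambiguous.
        bases.append(counts.most_common(1)[0][0] if depth >= min_cov else "N")
    covered = sum(b != "N" for b in bases) / len(bases)
    return "".join(bases) if covered >= min_fraction else None


def snp_distance(a, b):
    """Number of differing sites, skipping ambiguous bases (an assumption)."""
    return sum(1 for x, y in zip(a, b) if x != y and "N" not in (x, y))


def complete_linkage(seqs, max_dist):
    """Agglomerative clustering with complete linkage on SNP distances.

    seqs: dict mapping sample name -> reconstructed sequence.
    Merging stops when the closest pair of clusters is farther than max_dist.
    """
    clusters = [[name] for name in seqs]

    def linkage(c1, c2):
        return max(snp_distance(seqs[i], seqs[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[i], clusters[j]) > max_dist:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Complete linkage defines the distance between two clusters as the largest distance between any pair of their members, so every sequence in a merged cluster is within max_dist differences of every other member.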


Assigning orphan immunity sequences to bacterial species 

We aimed to identify the species most likely to encode the immunity gene
in each cluster of identical sequences reconstructed from metagenomes.
Only clusters with at least three sequences were used, to ensure statisti- 
cal confidence. The abundance of each species was assessed based on 


species-specific marker genes as described above. We specifically used a 
simple linear model that assumed that only a single species encodes the 
immunity gene. We further assumed a one-to-one relationship between 
species marker gene abundance and orphan immunity gene abundance,
and accordingly fixed the intercept at zero and allowed a single species
with a slope of one. The fit of the model for each species was calculated as
the mean squared error over all samples. The most likely species to encode
the immunity gene was determined by the minimum mean squared error.
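
Because the slope is fixed at one and the intercept at zero, the model has no free parameters and the fit reduces to a mean squared error between two abundance vectors. A minimal Python sketch of this comparison follows; the function names and the toy species labels in the usage note are hypothetical, and the abundance values are assumed to be computed as described above.

```python
def mean_squared_error(species_abundance, immunity_abundance):
    """Error of the fixed model immunity = 1 * species + 0 across samples."""
    if len(species_abundance) != len(immunity_abundance):
        raise ValueError("abundance vectors must cover the same samples")
    squared = [(i - s) ** 2 for s, i in zip(species_abundance, immunity_abundance)]
    return sum(squared) / len(squared)


def best_species(species_table, immunity_abundance):
    """Return the species with the minimum error, plus all per-species errors.

    species_table: dict mapping species name -> per-sample abundance list.
    """
    errors = {name: mean_squared_error(values, immunity_abundance)
              for name, values in species_table.items()}
    return min(errors, key=errors.get), errors
```

For example, with immunity abundances [1.0, 2.0, 3.0] and candidate species "A" at [1.0, 2.0, 3.5] and "B" at [0.0, 0.0, 0.0], species "A" gives the smaller error and would be reported as the most likely carrier.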


Assembly of orphan immunity sequences from metagenomes 

Paired-end metagenomic sequencing data were assembled using SoapDeNovo2
with a k-mer length of 63 and an average insert size of 200”.
BLAST was used to identify the contig that contained the orphan immu- 
nity gene, and GeneMarkS was used to predict protein-coding genes™. 


Bacterial culture conditions 

Anaerobic culturing procedures were performed either in an anaerobic 
chamber (Coy Laboratory Products) filled with 70% N2, 20% CO2 and
10% H2, or in Becton Dickinson BBL EZ GasPak chambers. E. coli EC100D
λpir and S17-1 λpir strains were grown aerobically at 37 °C on lysogeny
broth (LB) agar. Unless otherwise noted, Bacteroides strains were cultured
under anaerobic conditions on brain heart infusion (BHI) agar (Becton
Dickinson) supplemented with 50 µg ml⁻¹ haemin and 10% sodium
bicarbonate (BHIS)®. Antibiotics and chemicals were added to media
as needed at the following concentrations: trimethoprim 50 µg ml⁻¹,
carbenicillin 150 µg ml⁻¹, gentamicin 15 µg ml⁻¹ (E. coli), gentamicin 60
µg ml⁻¹ (Bacteroides), erythromycin 12.5 µg ml⁻¹, tetracycline 6 µg ml⁻¹,
chloramphenicol 12 µg ml⁻¹, floxuridine (FUdR) 200 µg ml⁻¹.


Genetic techniques 

Standard molecular procedures were used for the creation, maintenance 
and E. coli transformation of plasmids. All primers used in this study were
synthesized by Integrated DNA Technologies (IDT). Phusion polymerase, 
restriction enzymes, T4 DNA ligase, and Gibson Assembly Reagent were 
obtained from New England Biolabs (NEB). A comprehensive list of primers,
plasmids and strains is provided (Supplementary Table 10). Deletion
of the gene encoding thymidine kinase in B. fragilis, B. ovatus and
B. xylanisolvens strains was performed by cloning respective genomic
flanking regions into the vector pKNOCK as previously described*. In
brief, pKNOCK-tdk plasmids were mobilized into Bacteroides strains
via overnight aerobic mating with E. coli. Integrants were isolated by
plating on selective media, were passaged once without antibiotics to 
allow for plasmid recombination, and plated for counter selection on 
FUdR. Recovered single colonies were patched onto selective media to 
ensure loss of pKNOCK, and disruption of tdk was confirmed by PCR. 
Subsequent deletion of orphan immunity genes was performed in Δtdk
strains via a similar counter selection strategy, except employing the 
suicide plasmid pExchange in place of pKNOCK®. Genomic deletions 
were confirmed by PCR. Gene complementation was performed by 
cloning genes into pNBU2-erm_us1311 for constitutive expression”. 


Isolation of Bacteroides strains from faecal samples 

Faecal samples from healthy infants used for strain isolation were col- 
lected as part of a previous study approved by the Seattle Children’s 
Hospital Institutional Review Board’®*’. Frozen stool samples stored at
-80 °C were manually homogenized, serially diluted in tryptone yeast 
glucose (TYG) broth, and plated under anaerobic conditions on Bac- 
teroides bile esculin (BBE) agar plates (Oxyrase). Single colonies that 
exhibited esculin hydrolysis as indicated by the production of black 
pigment on BBE agar were sub-cultured in TYG broth with the addition 
of 60 µg ml⁻¹ gentamicin until stationary phase and then were frozen at
-80 °C after the addition of sterile glycerol to 20% final concentration. 
Single colonies isolated from these stocks were subsequently screened 
by PCR with primers targeting the orphan i6 gene as assembled from 
metagenomic short read sequence data’®. 




Genome sequencing 

Genomic DNA used for Illumina sequencing was prepared by collecting 
Bacteroides strains grown overnight on BHIS blood agar plates. Cells 
resuspended from plates were washed in PBS before DNA extraction 
with the Qiagen DNeasy Blood and Tissue Kit. Sequencing was performed
on an Illumina MiSeq using the V3 Reagent kit at the Northwest
Genomics Center sequencing facility at the University of Washington. 
AID clusters often appear in highly repetitive genomic contexts (for 
example, mobile elements) and are often split into multiple scaffolds 
in reference genomes. To compensate for this, we also performed long-read
sequencing via PacBio on a subset of genomes. To this end, high
molecular mass DNA was extracted using the Qiagen Genomic-tip Kit 
and sequenced by SNPsaurus using a PacBio Sequel. Hybrid long read 
and short read assemblies were conducted using Unicycler”’. Species 
identification was performed by blast searches with species-specific 
marker genes”?. 


Interbacterial competition assays 

Bacteroidales strains were grown on BHIS blood agar plates overnight 
at 37 °C. Bacteria were resuspended from plates in BHIS broth and the 
optical density (OD) of each strain was adjusted to a 10:1 B. fragilis NCTC
9343 to competitor ratio (OD600 6.0 to 0.6) for competitions involving
B. xylanisolvens and B. ovatus, or 1:1 ratio for competitions involving
B. fragilis 638R (OD600 6.0). Equal volumes of each strain at the adjusted
OD were mixed and 5 µl of bacterial mixtures were spotted onto pre-dried
BHIS blood agar plates, in triplicate spots. Competitions were
allowed to proceed for 20-24 h at 37 °C under anaerobic conditions 
before spots were collected into BHIS broth. Competition outcomes 
were quantified in one of two ways: (1) by serial dilution for enumeration 
of c.f.u. after plating on BHIS-selective plates containing either eryth- 
romycin or tetracycline; or (2) purification of total genomic DNA using 
the Qiagen DNeasy Blood and Tissue Kit and subsequent quantification 
by qPCR using strain-specific primers (see Supplementary Table 10). 
For antibiotic selection, B. fragilis 9343 was marked with erythromycin 
resistance by integration of pNBU2-erm at the attl site”. Other strains
were either naturally tetracycline resistant, or marked by integration 
of pNBU2-tet-BCO1. Strains with insertions of pNBU2 were selected 
for matching integration sites by PCR with primers flanking att loci*®. 
Interbacterial competitions between strains of B. fragilis occasionally 
exhibited T6SS-independent phenotypes that were dependent on the
initial starting ratio of the strains used". 
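
The relative recipient fitness reported for these assays is defined in the legend of Fig. 3 as the ratio of final to initial c.f.u., normalized to the corresponding experiment with the T6S-inactive (ΔtssC) strain. A minimal Python sketch of that calculation follows. The function names are hypothetical, and pairing each replicate with its own ΔtssC control is an assumption; the text does not specify how replicates were paired.

```python
from statistics import mean, stdev


def fitness(initial_cfu: float, final_cfu: float) -> float:
    """Ratio of final to initial c.f.u. of the recipient strain."""
    return final_cfu / initial_cfu


def relative_fitness(vs_active, vs_inactive) -> float:
    """Recipient fitness against a T6S-active competitor, normalized to the
    fitness against the T6S-inactive control.

    Each argument is an (initial_cfu, final_cfu) pair for the recipient.
    """
    return fitness(*vs_active) / fitness(*vs_inactive)


def summarize(replicates):
    """Mean and sample standard deviation over replicate competitions.

    replicates: list of (vs_active, vs_inactive) pairs, at least two entries.
    """
    values = [relative_fitness(a, i) for a, i in replicates]
    return mean(values), stdev(values)
```

A value near 1 indicates that the T6SS of the competitor has little effect on the recipient, whereas values well below 1 indicate T6S-dependent inhibition.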


Interbacterial mobile element transfer assays 

Allelic exchange was used to engineer a high-expression chlorampheni- 
col resistance cassette onto the AID-1 system of B. fragilis 638R, replac- 
ing BF638R_2056-2058”. Chloramphenicol-resistant B. fragilis 638R 
cells were mixed on BHIS blood agar plates with erythromycin-resistant 
B. fragilis ATCC 43859 cells at a 1:1 ratio (OD600 6.0). After overnight co-culture,
bacterial mixtures were collected and plated on BHIS plates
containing either erythromycin alone (to quantify c.f.u. of total ATCC 
cells), or erythromycin and chloramphenicol (to quantify c.f.u. of AID-1 
recipient ATCC cells). Double-resistant colonies were screened individually
by PCR to confirm strain identity, the presence of the AID-1 system,
and the genomic integration site at a tRNAS gene (see Supplementary
Table 10 for primers used). 
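
The two plate counts described above allow a transfer frequency to be estimated. The Python sketch below assumes that frequency is expressed as double-resistant c.f.u. per total recipient c.f.u.; the text does not state the exact formula, and the function names, dilution factors and plated volumes are hypothetical.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate c.f.u. per ml from a colony count on a dilution plate."""
    return colonies * dilution_factor / plated_volume_ml


def transfer_frequency(transconjugant_cfu: float, recipient_cfu: float) -> float:
    """AID-1 recipients (erythromycin plus chloramphenicol plates) per total
    recipient cells (erythromycin-only plates)."""
    if recipient_cfu <= 0:
        raise ValueError("recipient c.f.u. must be positive")
    return transconjugant_cfu / recipient_cfu
```

Both counts need to be converted to the same units, such as c.f.u. per ml of the collected mixture, before the ratio is taken.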


Gnotobiotic mice studies 

Germ-free 6-12-week-old female Swiss Webster mice from several litters 
were randomized, housed simultaneously in pairs in single Techniplast 
cages with a 12-h light/dark cycle, and fed a standard laboratory diet 
(Laboratory Autoclavable Rodent Diet 5010, LabDiet), in accordance 
with guidelines approved by the University of Washington Institutional 
Animal Care and Use Committee. Blinding was not performed, and no 
statistical methods were used to determine sample size. Reasonable
numbers of animals were used considering limitations of housing and
maintenance under gnotobiotic conditions. Bacteroides fragilis strains 
were introduced into mice via oral gavage of 10° c.f.u. suspended in 0.2 
ml of sterile PBS with 20% glycerol. Challenge with B. fragilis 638R or 
B. fragilis ATCC strains occurred 7 days after pre-colonization with 
B. fragilis 9343 strains. Colonization levels by each strain in each mouse 
were tracked by collection of faecal pellets over a period of 4 weeks, plating
on selective BHIS agar plates (B. fragilis 9343 on BHIS plus erythromycin;
B. fragilis 638R and ATCC on BHIS plus tetracycline), and subsequent
absolute quantification of c.f.u. by normalization of each sample to 
the initial pellet weight. Differences in the strain ratio of c.f.u. between
groups at each time point were assessed using Mann-Whitney U-tests.
Non-parametric tests were used following Shapiro-Wilk analysis for 
normality of data at each time point. Mice were confirmed to be sterile 
before colonization by qPCR with primers targeting the 16S rRNA gene
and free of non-Bacteroides contamination by plating faecal pellets on 
non-selective LB and BHIS plates incubated under either anaerobic or 
aerobic conditions”. 
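
The quantification described above can be sketched as follows, as an illustration only: colony counts are converted to c.f.u. per gram of faeces using the pellet weight, and a Mann-Whitney U statistic is computed for two groups. The resuspension volume, parameter names and helper functions are assumptions introduced here; in practice a statistics library (for example `scipy.stats.mannwhitneyu`) would also supply the P value.

```python
def cfu_per_gram(colonies, dilution_factor, plated_volume_ml,
                 resuspension_volume_ml, pellet_weight_g):
    """Absolute c.f.u. per gram of faecal pellet (all inputs hypothetical)."""
    if plated_volume_ml <= 0 or pellet_weight_g <= 0:
        raise ValueError("plated volume and pellet weight must be positive")
    per_ml = colonies * dilution_factor / plated_volume_ml
    return per_ml * resuspension_volume_ml / pellet_weight_g


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x versus sample y.

    Counts the pairs (xi, yj) with xi > yj; ties contribute 0.5.
    Only the statistic is returned, not a P value.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u
```

For two groups of sizes n1 and n2, the statistic for the other group is n1 × n2 minus the value returned here.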


Bioinformatic analysis of rAID-1 clusters 

The amino acid sequence of the B. fragilis NCTC 9343 polyimmunity-associated
XerD-like tyrosine recombinase (BF9343_RS08045) was used
as a query against a custom database of 423 Bacteroidales genomes
downloaded from GenBank. rAID clusters in Bacteroidales genomes were
identified based on the following criteria: (i) presence of a 5′ XerD-like
tyrosine recombinase gene encoding a protein with amino acid identity
exceeding 44% (corresponding to an E value of 1x 107); (ii) two or more
co-directionally oriented downstream genes that possessed (iii) a GC
content of 41% or lower. The end of the gene cluster was defined as the
stop codon of the last co-directionally oriented gene in the cluster with
similar GC content. To identify homologues of genes within rAID-1 clusters,
open-reading frames (ORFs) within the clusters were translated
and used as tblastn queries against the NCBI non-redundant nucleotide 
database. Top hits from these searches were often genes in other rAID 
clusters; therefore, these hits were discarded. The top non-rAID hit from 
tblastn searches with an E-value threshold of 1 x 10-*° was selected as 
the closest homologue. rAID cluster genes were assigned to interbacterial
immunity gene families via hmm scans with profiles previously
described? with an E-value cut-off of 1x 10°. rAID cluster genes were 
also compared via tblastn with 46 Bacteroidales T6SS immunity genes 
from subtypes 1-3°* with an E-value cut-off of 1x 10. The percentage 
amino acid identity with homologues was assessed if sequences could 
be aligned across more than 80% of their length. Motif enrichment analysis
was performed on non-coding sequences within a subset of rAID-1
clusters (14 sequences immediately 3’ of the recombinase stop codon, 
and 86 intergenic sequences between rAID-1 ORFs), using MEME Suite 
5.0.2 and default settings**. 
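
The three cluster criteria stated above can be expressed as a simple filter. The following is a minimal sketch, not the authors' pipeline: it assumes that GC content is evaluated gene by gene and that each downstream gene is supplied as a (strand, nucleotide sequence) pair, with '+' meaning co-directional with the recombinase. All names and example sequences are hypothetical.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def is_raid_candidate(recombinase_identity_pct, downstream_genes,
                      min_identity=44.0, max_gc=41.0, min_genes=2):
    """Apply criteria (i)-(iii) to one candidate locus.

    downstream_genes: list of (strand, sequence) tuples in genomic order
    downstream of the recombinase; strand '+' means co-directional.
    """
    # (i) recombinase identity must exceed the threshold
    if recombinase_identity_pct <= min_identity:
        return False
    cluster = []
    for strand, seq in downstream_genes:
        # the cluster ends at the last co-directional, low-GC gene
        if strand != "+" or gc_content(seq) > max_gc:
            break
        cluster.append(seq)
    # (ii) and (iii): at least two co-directional genes with GC <= 41%
    return len(cluster) >= min_genes
```

Evaluating GC content over the whole cluster rather than per gene would be an equally plausible reading of the text and would only change the loop condition.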


Heterologous expression of Bacteroides toxin and immunity 
genes 

To assess the ability of cognate immunity or orphan immunity to neu- 
tralize the toxicity of a Bacteroidales T6SS effector, genes were cloned 
into E. coli expression vectors pSCrhaB2-V (effectors) and pPSV39-CV
(immunity). Immunity genes were fused with the P. aeruginosa ribosome
binding site from hcp1 during the cloning process*. All cloning
steps for effector genes involved growth of E. coli on media containing
0.1% glucose to ensure repression of expression. E. coli DH5α cells were
co-transformed with pSCrhaB2 and pPSV39 plasmids bearing genes
of interest. Overnight cultures were then grown from single co-transformed
colonies to stationary phase in LB broth containing 50 μg ml−1
trimethoprim, 15 μg ml−1 gentamycin, and glucose. Cells were collected
from these cultures and washed to remove glucose before back-dilution
to an OD600 of 0.05 into LB broth containing 50 μg ml−1 trimethoprim, 15
μg ml−1 gentamycin, 0.05% rhamnose, and 1 mM isopropyl β-D-1-thiogalactopyranoside*°.
Cultures were then grown for 8 h with shaking at 37 °C before
plating to allow quantification of c.f.u. Experiments were performed
with technical triplicates for each of at least two biological replicates. 


Gene expression analysis of AID-1 and rAID-1 systems of B. ovatus
3725 

To assess the level of expression of genes in the AID-1 and rAID-1 systems
of B. ovatus 3725, bacterial cells were first grown overnight on BHIS
blood agar plates containing gentamycin. Cells were then resuspended
in BHIS to an OD600 of 3.0 for B. ovatus monocultures, or to an OD600 of
0.3 for mixed co-culture experiments with B. fragilis 9343 at tenfold
excess (OD600 of 3.0). Then, 5 μl volumes of bacterial mixtures were
spotted on BHIS blood agar plates. Plates were incubated at 37 °C
for 2 h under anaerobic culture conditions before cells were collected
directly in Buffer RLT plus β-mercaptoethanol (20 5-μl spots per condition
per replicate, Qiagen RNeasy Micro Kit). Two separate rounds of
DNase treatment were performed (Qiagen RNase-free DNase, Thermo 
Fisher Scientific Turbo DNase-free kit). RNA samples were confirmed to 
be free of genomic DNA by PCR with primers targeting the Bacteroides 
16S rRNA gene. cDNA was generated using the High Capacity cDNA 
Reverse Transcription Kit (Applied Biosciences). Following synthesis, 
cDNA was diluted 1:10. qPCR (primers listed in Supplementary Table 10) 
was performed using SSO Universal SYBR Green Supermix (Bio-Rad) 
on a CFX96 machine (Bio-Rad). Genomic DNA was used to generate
standard curves”. Differences in gene expression between samples
were determined by normalization to the expression level of B. ovatus
3725 gyrB. Primers targeting gyrB were designed to target regions of the 
genes that are highly polymorphic between B. fragilis and B. ovatus, and 
species-specificity for B. ovatus was confirmed by PCR using B. fragilis 
genomic DNA*. 
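
A minimal sketch of the quantification logic described above, assuming a conventional log-linear standard curve (Cq against log10 of input genomic DNA) and a simple ratio to the gyrB reference. The function names and example values are hypothetical; the authors' exact calculation is not given in the text.

```python
def fit_standard_curve(log10_quantities, cq_values):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    n = len(log10_quantities)
    if n < 2 or n != len(cq_values):
        raise ValueError("need at least two matched standard points")
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate template quantity."""
    return 10 ** ((cq - intercept) / slope)


def normalized_expression(target_quantity, reference_quantity):
    """Target gene quantity relative to the reference gene (here gyrB)."""
    return target_quantity / reference_quantity


# Hypothetical tenfold dilution series of genomic DNA standards.
slope, intercept = fit_standard_curve([0, 1, 2, 3], [30.0, 27.0, 24.0, 21.0])
```

Separate standard curves would normally be fitted for each primer pair, so that differences in amplification efficiency do not bias the ratio.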


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


All data required to assess the conclusion of this research are available 
in the main text and Supplementary Information, have been depos- 
ited at the Sequence Read Archive (SRA) under BioProject accession 
PRJNA484981 or are available from https://github.com/borenstein-lab/T6SS.


Code availability 


Python and R scripts used in this work are available for download 
(https://github.com/borenstein-lab/T6SS). 
Extended Data Fig. 1 | Prevalence of B. fragilis-specific orphan immunity
genes in adult and infant microbiomes. a, Number of adult human gut
microbiome samples in which the indicated immunity genes (1-14, GA3_i1-14
from ref. *) can be detected at an 80% nucleotide identity threshold and an
abundance more than tenfold that of B. fragilis marker genes. Bars coloured as in
Fig. 1a, and asterisks indicate immunity genes without orphan representation.
b, Comparison of abundance of B. fragilis-specific T6SS immunity genes with
B. fragilis species-specific marker genes in infant microbiome samples”
(Supplementary Table 4). Abundances are calculated as in Fig. 1a. Samples in
which immunity gene abundance exceeds that of Bacteroides by over tenfold
(blue) are highlighted.
Extended Data Fig. 2 | Diversity and genomic context of orphan immunity
genes in human gut microbiomes and diverse Bacteroides species.
a, Representative AID-1 gene clusters containing homologues of the indicated
B. fragilis T6S immunity genes from the indicated reference genomes. b, Data
points indicate the amino acid identity of unique genes homologous to
indicated B. fragilis-specific T6SS cognate immunity genes identified through
BLAST analysis of the IGC” (n=88 genes, maximum E=1x 10°; minimum
percentage identity, 60%).
Extended Data Fig. 3 | Orphan immunity genes specifically enhance the
fitness of Bacteroides strains in vitro and in vivo. a, b, T6SS-dependent
competitiveness of parental strains of B. ovatus 3725 and the indicated mutant
and complemented derivatives during in vitro growth competition experiments
with B. fragilis 9343. Relative recipient fitness was determined by calculating the
ratio of final to initial c.f.u. and normalizing to the corresponding experiment
with B. fragilis 9343 lacking tssC (T6S-inactive). Data are mean ± s.d. of three
independent biological replicates. *P < 0.01, unpaired two-tailed t-test. c, T6SS-dependent
competitiveness of a parental strain of B. ovatus 3725 or a strain
bearing in-frame deletions of indicated orphan immunity genes, during in vitro
growth competition experiments with an orthogonal effector-bearing B. fragilis
638R parental strain or a derivative strain lacking tssC (T6S-inactive). Relative
recipient fitness and statistics were calculated as in a and b. n = 3 independent
biological replicates. d, e, Recovery of B. fragilis 9343 (d) or 638R (e) and the
indicated orphan immunity mutant derivative from pairwise competitions of
the strains in germ-free mice. Lines indicate the mean at each time point (n = 8
mice per group for each of two independent experiments). Alternating time
points of these data are included in ratio form in Fig. 3c. f, Schematic depicting
genomic loci for the B. fragilis ATCC 43859 parental strain, the B. fragilis 638R
AID-1 donor strain, the AID-1 system, and the ATCC 43859 AID-1 recipient. Grey
shading indicates homology; red arrows indicate the position of PCR primers
used to infer insertion of the AID-1 element at the tRNA insertion site.
g, Abundance of B. ovatus in samples lacking detected orphan immunity genes
(-) and samples in which the indicated orphan immunity genes were assigned to
B. ovatus (+). Abundances are calculated as in Fig. 1a. *P < 0.001, Wilcoxon rank-sum
test. n = 128 non-orphan samples, n = 24 samples containing orphan
immunity. For box plots, the middle line denotes the median; the box denotes
the interquartile range (IQR); and the whiskers denote 1.5x the IQR.
Extended Data Fig. 4 | The GA2 system of Bacteroidales mediates
interbacterial antagonism. Recovery of Bacteroides dorei DSM 17855 cells
lacking GA2_e14-i14 (BACDOR_RS22955-17020) from two-strain in vitro growth 
competition experiments with the indicated donor strains. n=3 technical 
Extended Data Fig. 5 | rAID-1 systems include conserved and repetitive
intergenic sequences and bear hallmarks of horizontal gene transfer. a, Left,
motif enrichment analysis from the intergenic sequences immediately 3′ of the
recombinase stop codon to the start codon of the first downstream open
reading frame within 16 randomly selected rAID-1 gene clusters. This region is
highlighted in blue in three representative rAID-1 systems shown above. Right,
motif enrichment analysis from all 86 intergenic sequences between the ORFs of
six rAID-1 clusters (B. fragilis NCTC 9343, B. cellulosilyticus WH2, B. ovatus 3725,
Paraprevotella clara YIT 11840, Parabacteroides goldsteinii dnLKV18, and
Parabacteroides gordonii MS-1)**. This region is highlighted in red in three
representative rAID-1 systems shown above. b, Average G+C nucleotide content
of rAID-1-associated recombinase versus rAID-1 predicted ORFs (n = 226).
***P < 0.0001, unpaired two-tailed t-test. c, Schematic depicting the G+C and
A+T nucleotide content across a representative rAID-1 system from B. fragilis
9343. d, Frequency distribution of gene number in rAID-1 clusters (n = 1,247
genes in 226 clusters). Bin width is five genes. e, Composition of genes in rAID-1
clusters (n = 226 clusters) as determined by profile HMM scans and BLAST
analysis against a curated database of Bacteroidales T6SS immunity genes".
f, Comparison of the total abundances of rAID-1-associated predicted
recombinases and the Bacteroides genus in adult microbiome samples derived
from the HMP and MetaHIT studies (Supplementary Table 8). Abundance values
are calculated as in Fig. 1; genus abundance corresponds to the sum of all
Bacteroides spp. (calculated individually as the average of species-specific
marker gene abundances). g, Results of qRT-PCR analyses for the indicated
B. ovatus 3725 genes belonging to AID-1 (i6, MO88_1971) or rAID-1 clusters (Rec,
recombinase, MO88_1401; orf1, MO088_1400) under conditions of growth in
mono- or co-culture with B. fragilis 9343 for 2 h. Data are mean ± s.d. of three
independent biological replicates. *P < 0.05, **P < 0.01, Wilcoxon two-tailed
sign-rank test.
Epigenetic aberrations are widespread in cancer, yet the underlying mechanisms and
causality remain poorly understood’ ®. A subset of gastrointestinal stromal tumours 
(GISTs) lack canonical kinase mutations but instead have succinate dehydrogenase 
(SDH) deficiency and global DNA hyper-methylation*». Here, we associate this hyper- 
methylation with changes in genome topology that activate oncogenic programs. To 
investigate epigenetic alterations systematically, we mapped DNA methylation, CTCF 
insulators, enhancers, and chromosome topology in KIT-mutant, PDGFRA-mutant and
SDH-deficient GISTs. Although these respective subtypes shared similar enhancer
landscapes, we identified hundreds of putative insulators where DNA methylation 
replaced CTCF binding in SDH-deficient GISTs. We focused on a disrupted insulator 
that normally partitions a core GIST super-enhancer from the FGF4 oncogene. 
Recurrent loss of this insulator alters locus topology in SDH-deficient GISTs, allowing 
aberrant physical interaction between enhancer and oncogene. CRISPR-mediated 
excision of the corresponding CTCF motifs in an SDH-intact GIST model disrupted the 
boundary between enhancer and oncogene, and strongly upregulated FGF4 
expression. We also identified a second recurrent insulator loss event near the KIT 
oncogene, which is also highly expressed across SDH-deficient GISTs. Finally, we
established a patient-derived xenograft (PDX) from an SDH-deficient GIST that 
faithfully maintains the epigenetics of the parental tumour, including 
hypermethylation and insulator defects. This PDX model is highly sensitive to FGF 
receptor (FGFR) inhibition, and more so to combined FGFR and KIT inhibition, 
validating the functional significance of the underlying epigenetic lesions. Our study 
reveals how epigenetic alterations can drive oncogenic programs in the absence of 
canonical kinase mutations, with implications for mechanistic targeting of aberrant
pathways in cancers.


The human genome is partitioned into physical domains, often termed
topologically associated domains (TADs), by chromosomal boundaries 
established by the DNA-binding insulator protein CTCF and cohesin*®°. 
Many proto-oncogenes and master regulators are isolated in such 
domains and thus protected from promiscuous enhancer interactions”. 

Mutations of tricarboxylic-acid-cycle-related enzymes, including SDH 
and isocitrate dehydrogenase (IDH), are initiating events in many tumour
types'+°. These lesions cause accumulation of succinate and 2-hydrox- 
yglutarate, respectively, which inhibit demethylases and are associ- 
ated with DNA hypermethylation and other epigenetic alterations*"”. 


The CTCF insulator is methylation-sensitive and may be displaced by 
DNA methylation” ». We previously showed that the PDGFRA oncogene 
is aberrantly activated by insulator defects in IDH-mutant glioma”.
We hypothesized that SDH deficiency alters chromosome topology
to drive GIST tumorigenesis (Fig. 1a). GISTs are the most common
gastrointestinal tract sarcoma. They are typically caused by gain- 
of-function mutations of the KIT or PDGFRA oncogenes that render 
these receptor tyrosine kinases (RTKs) active and ligand-independent”’. 
However, approximately 15% of GISTs lack these defining mutations, and 
have instead lost SDH expression due to mutation or transcriptional 


1Department of Pathology and Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA. 2Broad Institute of MIT and Harvard, Cambridge,
MA, USA. 3Center for Sarcoma and Bone Oncology, Dana-Farber Cancer Institute, Boston, MA, USA. 4Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical
School, Boston, MA, USA. 5Department of Surgery, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA. 6Experimental Therapeutics Core, Belfer Center for
Applied Cancer Science, Dana-Farber Cancer Institute, Boston, MA, USA. 7Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA.
8Department of Oncologic Pathology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA. 9Ludwig Center at Harvard, Harvard Medical School, Boston, MA, USA.
10Present address: The Lautenberg Center for Immunology and Cancer Research, IMRIC, Faculty of Medicine, The Hebrew University, Jerusalem, Israel. 11These authors contributed equally:
William A. Flavahan, Yotam Drier. *e-mail: yotam.drier@mail.huji.ac.il; George_Demetri@dfci.harvard.edu; Bernstein.Bradley@mgh.harvard.edu


Nature | Vol575 | 7 November 2019 | 229 




[Fig. 1 graphic, panels a-h. Panel a contrasts normal topology (paired CTCF sites form an active insulator that blocks the super-enhancer; inactive oncogene) with pathological insulator loss (CpG methylation prevents CTCF binding; activated oncogene). Recoverable axis labels: change in CTCF binding (log2 normalized count); median expression in SDH-deficient GISTs (log2 TPM); average expression in SDH-intact GIST (TPM).]


Fig. 1 | Insulator dysfunction in SDH-deficient GISTs. a, Proposed mechanism
of epigenetic oncogene activation. Left, oncogene shielded from super-enhancer
by CTCF insulator, which creates a topological boundary. Right, CTCF
insulator displaced by DNA methylation, allowing the super-enhancer to contact
and induce the oncogene. b, Violin plots depict DNA methylation levels over the
10,000 most variable CpG island promoters (top) and CTCF sites (bottom) in
normal stomach muscle (NSM; n = 2), and KIT mutant (n = 9), PDGFRA mutant
(P; n = 2) and SDH-deficient GISTs (n = 6). Yellow bars indicate mean. c, Volcano
plot depicts differential CTCF occupancy between SDH-deficient (n = 6) and
SDH-intact (n = 8) GISTs. Sites that gain DNA methylation in SDH-deficient GISTs
are indicated in red (>25% increase, two-sided t-test false discovery rate (FDR)
<5%). d, Plot depicts H3K27ac peaks near lost CTCF insulators (y axis) rank
ordered by signal strength. e, Scatter plot depicts genes (points) separated from
a super-enhancer by a CTCF loop anchor that is lost in SDH-deficient GIST. Genes
are positioned according to their relative (y axis) and absolute median
expression (x axis) in SDH-deficient GISTs. Potentially deregulated gene targets
(outliers) include oncogenes FGF3, FGF4 and KIT (red); see also Supplementary
Information. TPM, transcripts per million. f, Box plot depicts average expression
of MAPK signature genes in RNA-seq data for normal stomach (n = 262), and KIT
mutant (n = 10), PDGFRA mutant (n = 3) and SDH-deficient GISTs (n = 8). Boxes
depict 25th, 50th and 75th percentiles, and whiskers depict extreme values.
g, Radial phylogenetic tree depicts tyrosine kinase gene expression in SDH-deficient
GISTs. Each branch is one tyrosine kinase, arranged by similarity,
and with major families depicted by colour. The area of each red circle is
proportional to the average expression of the kinase. h, Scatter plot depicts
average expression of FGF ligands in SDH-intact (x axis) and SDH-deficient
(y axis) GISTs. FGF3 and FGF4 are highly expressed in SDH-deficient GISTs (bold).
For all panels, n values indicate number of biologically independent specimens.


silencing of SDH subunit genes (SDHA-D)'®. We collected an initial cohort 
of clinically defined specimens, including 11 KIT-mutant, 2 PDGFRA-mutant
and 8 SDH-deficient tumours (Supplementary Table 1). We used
hybrid-selection bisulfite sequencing to profile DNA methylation of 
over 160,000 CTCF sites and representative promoters in 17 of these 
tumours and 2 normal stomach muscle samples (see Methods). Consist- 
ent with previous reports’, SDH-deficient GISTs exhibited CpG island 
hypermethylation (Fig. 1b). In addition, a substantial fraction of CTCF 
sites were methylated in this GIST subtype (Fig. 1b). 

We next identified candidate insulators and enhancers in these 
tumours by mapping CTCF and histone H3 acetylated at lysine 
27 (H3K27ac) by chromatin immunoprecipitation and sequencing (ChIP-seq).
Overall patterns of enhancer acetylation were largely consistent
across GISTs, relative to gastrointestinal carcinomas (Extended Data
Fig. 1a). By contrast, comparison of genome-wide CTCF binding pro- 
files revealed that approximately 5% of sites were specifically lost in 
SDH-deficient GISTs (Fig. 1c). CTCF loss was accompanied by notable 
increases in DNA methylation at these sites (Fig. 1c and Extended Data
Fig. 1b, c). Given that DNA methylation has been established to prevent 
CTCF binding” ®, this suggests that hypermethylation displaces CTCF 
from hundreds of candidate insulators in SDH-deficient tumours. 

To investigate the impact of CTCF loss on genome topology, we used 
HiC to map TADs and TAD boundaries genome-wide in GIST-T1, a human
cell line with an oncogenic KIT mutation and intact SDH expression”. We
also used HiChIP to map CTCF loops and loop anchors, which correspond
to TADs and boundaries, respectively” (Extended Data Fig. 1d). We used
these maps to predict insulator losses likely to alter topology and gene 
expression. Of the 1,236 sites that lose CTCF and gain methylation in 
SDH-deficient GISTs, 688 corresponded to loop anchors. We reasoned 
that disruption of these loop anchors could alter topology and, in certain
cases, permit aberrant enhancer-promoter interactions (Fig. 1a).
Therefore, we further curated this list using enhancer maps and expression
data. This highlighted 60 CTCF loop anchors that would normally
have partitioned a large ‘super-enhancer’ from a gene, but that were lost
in SDH-deficient GISTs (Fig. 1d,e and Supplementary Table 2). Top hits
included lost CTCF insulators in the FGF3 and FGF4 locus (chromosome
11q13) and the KIT locus (chromosome 4q12) (Extended Data Fig. 2a, b).
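
The first filtering step in this paragraph (restricting lost, methylated CTCF sites to those that coincide with loop anchors) amounts to an interval-overlap test. The sketch below illustrates that idea only; the tuple layout, half-open coordinates and function names are assumptions, and the example coordinates are invented rather than taken from the data.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def sites_at_anchors(lost_sites, loop_anchors):
    """Return the lost CTCF sites that overlap at least one loop anchor."""
    return [site for site in lost_sites
            if any(overlaps(site, anchor) for anchor in loop_anchors)]


# Invented coordinates for illustration.
lost = [("chr11", 100, 200), ("chr4", 500, 600), ("chr11", 1000, 1100)]
anchors = [("chr11", 150, 400), ("chr4", 600, 700)]
kept = sites_at_anchors(lost, anchors)
```

The subsequent curation against enhancer maps and expression data would add further conditions on each retained anchor, which are not modelled here.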

Although SDH-deficient GISTs lack KIT or PDGFRA mutations’, our 
insulator analysis raised the possibility that RTKs may instead be epi- 
genetically deregulated. This prompted us to examine the expression 
of RTKs, ligands and downstream signalling programs. First, we found 
that a signature for mitogen-activated protein kinase (MAPK) targets is
highly expressed and suggestive of active RTK signalling in SDH-deficient
GISTs (Fig. 1f; see Methods). Second, a systematic analysis of tyrosine
kinase gene expression revealed that KIT and FGF receptor 1 (FGFR1) are
the most highly expressed RTKs in SDH-deficient GISTs (Fig. 1g). Third,
we found that FGF3 and FGF4 were expressed at remarkably high levels,
and were both specific to the SDH-deficient subtype (Fig. 1h). FGF3
and FGF4 are established oncogenes”, and FGF signalling could help to
explain the poor efficacy of KIT inhibitors in SDH-deficient GISTs””’. 
We therefore investigated the mechanisms underlying this striking and 
specific upregulation of FGF ligands. 

FGF3 and FGF4 reside in an approximately 250 kb TAD flanked by 
boundaries that contain CTCF-binding sites (Fig. 2a). The adjacent TAD 
on the 11q side contains a large cluster of enhancers, or super-enhancer.
This super-enhancer overlaps the gene ANO1, which encodes the GIST
clinical biomarker also known as DOG-1 (‘discovered on GIST-1’)**. The 
super-enhancer is highly acetylated and ANO1 is highly expressed in 
all GIST subtypes (Extended Data Fig. 2a). Notably, the TAD boundary 
that partitions this super-enhancer from the FGF genes, which we refer 
to as the ‘FGF insulator’, contains several CTCF-binding sites (Fig. 2a). 

We hypothesized that disruption of CTCF binding could compromise 
the FGF insulator and allow the ANO1 super-enhancer to contact and
activate the FGF genes. The FGF insulator contains two strong and several
weak CTCF-binding sites. Although these sites are consistently bound in
KIT- and PDGFRA-mutant tumours and normal stomach muscle control
samples, all five are markedly reduced in SDH-deficient GISTs (Extended
Data Fig. 2c). The strongest CTCF-binding site, which is closest to the
ANO1 super-enhancer, is almost completely lost in the SDH-deficient
samples (Extended Data Fig. 2d). Consistently, it is methylated specifically
in SDH-deficient tumours. This suggests that the FGF insulator has
switched to a methylated state that occludes CTCF binding. 

To assess the impact of CTCF loss on boundary integrity, we 
performed circularized chromatin conformation capture sequencing 
(4C-seq) on four SDH-intact and three SDH-deficient GISTs. We designed 
a ‘viewpoint’ primer that enabled us to quantify contacts between a 
central position in the ANO1 super-enhancer and other genomic positions
at high resolution (Fig. 2b). In SDH-intact tumours and stomach

Fig. 2 | FGF3-FGF4 locus topology reorganized in SDH-deficient GISTs.
a, Genomic views of the FGF3-FGF4 and ANO1 loci depict baseline chromosome
topology (HiC, red), genes (blue), CTCF-CTCF loop interactions (HiChIP, arcs,
with darkness indicating significance), CTCF binding (ChIP-seq, orange) and
candidate enhancers (H3K27ac ChIP-seq, green). HiC/HiChIP data are for
the SDH-intact model GIST-T1, whereas CTCF and H3K27ac data are for
representative clinical specimens (see also Extended Data Fig. 2). ANO1 super-enhancer
(green bar) and FGF insulator (orange shading) are indicated. b, Traces
depict 4C-seq interaction frequency (y axis) between the ANO1 super-enhancer
viewpoint primer (dashed white line) and genomic positions in the FGF3-FGF4-ANO1
locus (x axis). Data are shown for SDH-intact GISTs (n = 4; top),
normal stomach muscle (n = 1; grey line, top) and SDH-deficient GISTs (n = 3;
bottom). CTCF binding profiles for representative SDH-intact (top) and SDH-deficient
(bottom) tumours are also shown (orange). Genes (blue) and CTCF
sites in the FGF insulator (orange) are highlighted. c, d, Plots depict relative FGF4
(c) and FGF3 (d) expression in GIST-T1 cells expressing CRISPR-Cas9 and control
sgRNA (black) or sgRNAs targeting the two CTCF sites in the FGF insulator (red).
Bars indicate mean of three biologically independent replicates (dots). P values
by two-sided t-test.


muscle control samples, we detected robust interactions throughout 
the ANO1 TAD, but not beyond its boundaries, consistent with robust
FGF insulator function. In SDH-deficient tumours, however, the same
super-enhancer viewpoint physically interacted with sequences well
beyond the boundary, including with the FGF3 and FGF4 genes, which
are ~200 kb from the viewpoint (Fig. 2b and Extended Data Fig. 3a-c).
These data suggest that FGF3-FGF4 locus topology is profoundly altered
in SDH-deficient GISTs, with CTCF insulator loss allowing aberrant contacts
between the ANO1 super-enhancer and FGF ligand genes.

To assess directly whether FGF insulator loss affects FGF gene expression,
we used genome editing to disrupt the insulator in GIST-T1 cells,
which harbour a GIST-like enhancer landscape and retain CTCF binding 
and boundary function. We used CRISPR-Cas9 and short guide RNAs 
(sgRNAs) to edit the motifs underlying the two strongest CTCF sites 
in the FGF insulator (Extended Data Fig. 3d). This resulted in a sixfold
induction of FGF4, and a 35-fold induction of FGF3 (Fig. 2c, d). These data
directly link insulator loss to the marked upregulation of FGF ligands 
in SDH-deficient GISTs. 

Notably, a switch between CTCF-bound and DNA-methylated insulator
states underlies genomic imprinting’. FGF insulator loss might
therefore also represent a stable epigenetic event or ‘epimutation’ that

Fig. 3 | KIT-PDGFRA locus topology reorganized in SDH-deficient GISTs.
a, Genomic views of PDGFRA and KIT loci depict baseline chromosome topology
(HiC, red), genes (blue), CTCF-CTCF loop interactions (HiChIP, arcs), CTCF
binding (ChIP-seq, orange) and candidate enhancers (H3K27ac ChIP-seq,
green). HiC/HiChIP data are for the SDH-intact GIST model GIST-T1, whereas
CTCF and H3K27ac data are for representative clinical specimens (see also
Extended Data Fig. 2). KIT super-enhancer (green bar) and KIT insulator (orange
shading) are indicated. b, Traces depict 4C-seq interaction frequency (y axis)
between the KIT super-enhancer viewpoint primer (dashed white line) and
genomic positions in the KIT-PDGFRA locus (x axis). Data are shown for SDH-intact
GISTs (n = 6, top) and SDH-deficient GISTs (n = 3, bottom). CTCF profiles
for representative SDH-intact (top) and SDH-deficient (bottom) tumours are
also shown. Genes (blue) and CTCF-binding sites in the KIT insulator (orange) are
highlighted. c, Traces depict 4C-seq interaction signal between the KIT super-enhancer
viewpoint primer and the KIT gene in GIST-T1 cells expressing Cas9 and
control (black) or KIT insulator targeting sgRNAs (red). d, Plot depicts relative
KIT expression in GIST-T1 cells expressing Cas9 and control (black) or KIT
insulator targeting sgRNAs (red). Bar indicates mean of three biologically
independent replicates (dots). P values by two-sided t-test. e, f, FGF and KIT
insulator fraction methylation (β-value) evaluated in an expanded cohort of
GIST tumours by locus-specific bisulfite sequencing. e, Bar plot depicts average
methylation levels across six CpGs within FGF insulator CTCF peak 2 in normal
stomach muscle (NSM; n = 2), SDH-intact GISTs (n = 17) and SDH-deficient GISTs
(n = 11). f, Bar plot depicts average methylation levels across four CpGs within KIT
insulator CTCF peak 2 in normal stomach muscle (n = 2), SDH-intact GISTs
(n = 20) and SDH-deficient GISTs (n = 12) (n values indicate number of
biologically independent tumours).


effects a single allele. In five of our SDH-deficient samples, we identified 
heterozygous single nucleotide polymorphisms (SNPs) within an FGF3 
or FGF4 exon. In four of these cases, analysis of the informative SNP in 
RNA-seq data revealed that the FGF ligand gene was mono-allelically 
expressed (Extended Data Fig. 4a-c). By contrast, ANO1 was bi-allelically
expressed, suggesting that the biased FGF expression reflected
allele-specific insulator loss. Consistently, in one SDH-deficient tumour with
a heterozygous SNP near the CTCF site, we confirmed that only one
allele of the FGF insulator was methylated (Extended Data Fig. 4d). In a
second tumour with an informative SNP near the ANO1 super-enhancer

Fig. 4 | SDH-deficient GIST PDX trial confirms dependence on FGF signalling.
a, Specimen collection and generation of PDX model. b, Scatter plot compares
expression of genes (points) between primary tumour S1 (x axis) and PDX (y axis)
per RNA-seq. Pearson correlation is indicated. c, Venn diagram depicts overlap 
between strong H3K27ac ChIP-seq peaks in primary tumour (black) and PDX 
(green). d, Venn diagram depicts overlap between hypermethylated CpG 
islands in primary tumour (black) and PDX (red) per bisulfite sequencing. 

e, Scatter plot depicts principal component analysis (PCA) on the top 1,000 
differential CTCF sites for primary tumours (coloured by subtype) and PDX 
(star). PDX and originating tumour (S1) both cluster with SDH-deficient GISTs 


4C-seq viewpoint primer, we confirmed that the aberrant interaction 
between super-enhancer and FGF4 was also strongly biased to one 
allele (Extended Data Fig. 4e, f). These data suggest that insulator loss, 
topological reorganization and FGF induction reflect a stable epigenetic 
alteration propagated in the malignant clone. 

In addition to the FGF insulator, our screen identified a top-ranked
CTCF insulator loss in the KIT locus. This hit was of interest given that
KIT is an established GIST oncogene, and given prior reports of
cross-talk between FGF and KIT signalling. HiC and HiChIP data reveal that
the KIT gene is contained within a ~600 kb TAD (Fig. 3a). This large TAD
contains within it a smaller insulated domain (~100 kb) that is flanked
by CTCF sites. This smaller domain harbours a large super-enhancer
that is highly acetylated in all GIST specimens examined (Extended
Data Fig. 2b). It is partitioned from KIT by a topological boundary that
we refer to as the KIT insulator.

The KIT insulator contains two strong CTCF sites separated by around 
7 kb. Both sites gain methylation and lose CTCF binding in SDH-deficient 
GISTs (Extended Data Fig. 2e, f). To determine whether CTCF loss is 
associated with altered KIT locus topology, we performed 4C-seq using a
viewpoint primer in the insulated super-enhancer (Fig. 3b). In SDH-intact
tumours, the super-enhancer engages in robust interactions throughout
the small insulated domain, but not beyond its boundaries (Extended 
Data Fig. 5a, b). In SDH-deficient tumours, however, the super-enhancer 
interacts with sequences well beyond the KIT insulator (Extended Data 
Fig. 5c, d), consistent with loss of CTCF binding and boundary function. 
Notably, quantification of interaction signals in SDH-deficient tumours 
indicated that approximately 15-20% of interactions made by this
super-enhancer viewpoint are with the KIT promoter and gene, compared with
1-5% in SDH-intact tumours (Extended Data Fig. 5d). 
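
The quantification above amounts to asking what share of the viewpoint's 4C-seq signal falls inside a target interval. A minimal Python sketch of that idea follows, assuming fragment-level read counts keyed by genomic position; the function name `interaction_fraction`, the positions and the counts are hypothetical illustrations, not values or code from this study.

```python
def interaction_fraction(counts, target_start, target_end):
    """Fraction of 4C-seq signal that falls in [target_start, target_end).

    counts: dict mapping fragment position (bp) -> read count (hypothetical input).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no 4C signal to quantify")
    in_target = sum(n for pos, n in counts.items()
                    if target_start <= pos < target_end)
    return in_target / total


# Toy example with made-up positions and counts: 20 of 100 reads hit the target.
toy_counts = {100: 40, 200: 40, 300: 15, 400: 5}
fraction = interaction_fraction(toy_counts, 300, 500)
print(f"{fraction:.0%} of viewpoint interactions fall in the target interval")
```

The same function can be applied per tumour, which would allow the two subtypes to be compared on a common scale.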

To test directly whether CTCF loss alters KIT locus topology, we edited
the two CTCF motifs in the KIT insulator in GIST-T1 cells (Extended Data
Fig. 5e) and evaluated locus topology by 4C-seq. Although the KIT
insulator boundary was clearly evident in control GIST-T1 cells (Fig. 3a), it
was compromised in the edited cells, as demonstrated by frequent
contacts between super-enhancer and KIT (Fig. 3c). We also considered the
impact of insulator loss on KIT expression. Although GIST-T1 cells already
highly express a constitutively active form of this oncogene, we found
that insulator disruption further increased KIT expression by around 50%
under culture conditions that partially mimic SDH deficiency (Fig. 3d and
Extended Data Fig. 6). Although this proportional increase is modest,
it corresponds to a substantial increase in transcriptional output given
high baseline KIT expression in GIST-T1 cells.

Our hypothesis that FGF and KIT insulator losses drive SDH-deficient 
GIST predicts that these insulators should be recurrently disabled, and 
the corresponding oncogenes consistently expressed across tumours. 
We therefore examined insulator methylation across 32 GIST specimens 
from our original cohort and an additional validation cohort. Both
insulators were highly methylated in all SDH-deficient cases, but not in any
SDH-intact tumours or normal controls (Fig. 3e, f). Consistently, CTCF
binding to these insulators was compromised in all six SDH-deficient
GISTs evaluated, but retained in all SDH-intact tumours and normal
stomach muscle controls (Extended Data Fig. 2c-f). Furthermore, these CTCF
sites were consistently unmethylated across multiple non-malignant 
cell and tissue types, including a population enriched for interstitial 
cells of Cajal (ICCs), the presumed GIST cell of origin (Extended Data 
Fig. 7a). Finally, FGF4 is consistently expressed across SDH-deficient 
tumours yet it is only expressed at very low or undetectable levels in 
KIT-mutant GISTs, PDGFRA-mutant GISTs and ICCs (Fig. 1h and Extended
Data Fig. 7b). The recurrence and specificity of these insulator losses 
support their functional significance in SDH-deficient GISTs. 

Finally, we evaluated directly whether signalling through FGFR and/ 
or KIT is required for tumour growth. Although we are unaware of 
any in vitro SDH-deficient GIST models, we successfully established 
an in vivo PDX model from one of our SDH-deficient GIST specimens 


(KIT and PDGFRA wild type) (Fig. 4a). Model and parental tumour have
remarkably similar RNA expression, H3K27ac enhancer landscapes,
methylation and CTCF binding profiles (Fig. 4b-e). The PDX also maintains
characteristic enhancers and CTCF insulator losses in the FGF and KIT 
loci, and strongly expresses FGF3, FGF4 and KIT (Extended Data Fig. 8a, b). 
These data support the fidelity of this SDH-deficient GIST model. 

We therefore tested the efficacy of FGFR and KIT inhibitors in this
model. We used BGJ-398, a potent and selective inhibitor of FGFR1-4
in clinical development, and sunitinib, a drug approved for GIST with
potent activity against unmutated KIT. We dosed PDX mice for 28 days
with BGJ-398 (20 mg kg⁻¹), sunitinib (40 mg kg⁻¹) or a combination of the
two (Fig. 4f). Single agent sunitinib minimally suppressed tumour growth,
consistent with the drug-resistant phenotype of the SDH-deficient
GIST subtype and prior reports that cross-talk between FGF and KIT
signalling confers resistance to KIT inhibition. By contrast, single
agent BGJ-398 completely suppressed tumour growth throughout the
dosing period, strongly supporting a critical role for FGF signalling in
tumorigenesis (Fig. 4g and Supplementary Table 3). Sensitivity to FGFR
inhibition is specific to this GIST subtype as BGJ-398 lacks efficacy against
SDH-intact PDX models. Notably, the combination of FGFR and KIT
inhibitors resulted in the most durable response, with growth
suppression well beyond the dosing period (Fig. 4g and Extended Data Fig. 8c, d).
These pre-clinical data indicate that both RTK signalling pathways 
drive SDH-deficient GIST, and strongly support the significance of the 
underlying epigenetic lesions. 

In conclusion, we identify multiple epigenetic lesions that converge
to activate RTK signalling and proliferation in SDH-deficient GIST. We
show that the characteristic hypermethylation in these tumours is
associated with pervasive insulator losses, topological reorganization of
the FGF and KIT loci, and particularly potent induction of the FGF4 and
FGF3 oncogenes. Although our data do not address the precise cellular
contexts in which these lesions arise, it is notable that KIT signalling
regulates proliferation of the presumed GIST cell of origin, ICC.
Similarly, the ANO1 gene, the super-enhancer of which aberrantly drives FGF3
and FGF4 expression, encodes an ion channel that is highly expressed
and essential for ICC. Hence, topological changes that deregulate
FGF and KIT expression could lead to unchecked signalling in these
precursors. Although the corresponding loci are genetically wild type in
SDH-deficient GIST, the functional significance of their deregulation is
supported by the prevalence of gain-of-function KIT mutations in
SDH-intact GISTs and by a recent report that FGF4 is genetically amplified in a
rare subset of KIT/PDGFRA/SDH/RAS quadruple wild-type GIST. Our
pre-clinical PDX data substantiate their significance and establish proof
of concept for therapeutic intervention. Given that few stable epigenetic
events have been established as drivers of tumorigenesis, our
nomination of two novel functional lesions represents an important advance.

Methods


Primary GIST specimens and cell culture models 

Epigenetically characterized clinical samples were obtained as frozen 
specimens from Brigham and Women’s Hospital or from the Massachu- 
setts General Hospital Pathology Tissue Bank. The validation cohort was 
obtained as FFPE samples from the BWH tissue bank. All samples were 
acquired with Institutional Review Board approval, and were de-identified
before receipt. PDGFRA and KIT mutational status were confirmed 
through Sanger sequencing for frozen specimens, while SDH status 
was determined by immunohistochemistry (details below). 

Samples were also examined via RNA-seq and ChIP-seq input controls 
(details below) in order to look for mutations or copy number changes 
in all FGF ligand and receptor genes—no copy number alterations were 
found and no sequence variants were detected other than known anno- 
tated SNPs. 

The GIST-T1 cell line was obtained from Cosmo Biosciences. Cells
were passaged in DMEM with 10% serum, 1x antibiotics and 1x Glutamax
(Life Technologies). For pseudohypoxia experiments, cells were treated
with 200 µM deferoxamine mesylate (Sigma) or vehicle control (water)
for 72 h. For succinate conditions, cells were cultured in 20 pM
dimethyl-succinate (Sigma), which was slowly added to a cell culture dish
containing standard media and allowed to dissolve before addition of cells.


Chromatin immunoprecipitation 

ChIP-seq was performed as described previously. In brief, cultured 
cells or minced frozen tissue were crosslinked in 1% formaldehyde and 
snap frozen in liquid nitrogen before storage at −80 °C for at least
overnight. Sonication of samples was calibrated such that DNA was sheared
to between 300 and 700 bp fragment length. CTCF was precipitated 
with a monoclonal rabbit CTCF antibody, clone D31H2 (Cell Signaling 
no. 3418). Histone H3K27 acetylation was immunoprecipitated with 
antibody from Active Motif (no. 39133). ChIP DNA was used to generate 
sequencing libraries by end repair (End-It DNA repair kit, Epicentre), 3’ 
A base overhang addition via Klenow fragment (NEB) and ligation of 
barcoded sequencing adapters. Barcoded fragments were amplified 
via PCR. Libraries were sequenced as 38-bp end reads on an Illumina 
NextSeq500 instrument. Processed genomic data have been deposited
into GEO under accession number GSE107447, while raw sequencing
data have been deposited into dbGaP (phs001906.v1.p1). See also
Supplementary Table 4.

Reads were aligned to the reference genome (hg38) using BWA aln
version 0.7.4, removing reads with mapping quality score <10. For
H3K27 acetylation ChIP-seq and input controls, PCR duplicates were
removed by Picard toolkit 2.9.2. Peaks were called with HOMER 4.9
against input controls. To call all H3K27ac peaks, we used ‘histone’
settings. To call super-enhancers, we used ‘super’ settings and no
local filtering. CTCF peaks were called with ‘factor’ settings. To
measure H3K27ac correlations, signal at the union of the peaks (5 kb
window around the centre) was calculated by featureCounts 1.6.2. We
downloaded and reprocessed publicly available data of other
gastrointestinal tumours for comparison (GSM1969645, GSM1969657,
GSM2058055, GSM2058056, GSM2131266 and GSM2131280).
The dendrogram is based on unweighted average distance linkage 
of the Pearson correlations between the 10,000 most variable peaks, 
although analysis results were similar when comparing correlations 
over all peaks. 
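
The correlation step described above (select the most variable peaks, then correlate samples over them) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: the function names and the toy signal matrix are hypothetical, population variance is used as the variability measure, and the average-linkage clustering that produces the dendrogram is not shown.

```python
from math import sqrt
from statistics import mean, pvariance


def most_variable_rows(matrix, k):
    """Return the k rows (peaks) with the highest variance across samples."""
    return sorted(matrix, key=pvariance, reverse=True)[:k]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def sample_correlations(matrix, k):
    """Pairwise Pearson correlations between samples over the top-k variable peaks."""
    top = most_variable_rows(matrix, k)
    columns = list(zip(*top))  # one tuple of peak signals per sample
    n = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(n)] for i in range(n)]


# Toy signal matrix: rows are peaks, columns are three samples (made-up values).
signal = [
    [1.0, 2.0, 9.0],
    [2.0, 4.0, 1.0],
    [5.0, 5.0, 5.0],  # invariant peak, excluded when k = 3
    [0.0, 1.0, 7.0],
]
corr = sample_correlations(signal, k=3)
for row in corr:
    print([round(r, 2) for r in row])
```

The resulting matrix could then be converted to distances (for example 1 minus the correlation) and passed to any average-linkage clustering routine.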


Hybrid selection bisulfite sequencing 

Hybrid selection probes were designed to capture ~160,000
CTCF-binding sites, and ~5,000 promoters. Lists of CTCF-binding sites were
collated from ENCODE (as downloaded from UCSC genome browser, table
wgEncodeRegTfbsClusteredV3, Release 4) as well as additional CTCF
maps of primary cholangiocarcinoma and glioma. Total genomic DNA
was isolated using the DNeasy Blood & Tissue Kit (Qiagen) and sheared
using the Covaris LE220. Ampure XP beads (Agencourt) were used to
size select gDNA fragments within 150-320 bp and sheared
distribution was verified via BioAnalyzer (Agilent). One microgram of gDNA
was end repaired, 3’ A base tailed (KAPA Hyper Prep Kit no. KK8502) 
and ligated to sequencing adaptor (Roche SeqCap Epi Enrichment 
System). Ligated products were purified using Ampure XP beads. 
Following bead clean-up, products were bisulfite-converted using 
the EZ DNA Methylation-Lightning Kit (Zymo Research) and then 
PCR amplified using KAPA HiFi U+ HotStart ReadyMix (KAPA no. 
KK2800). Equal concentrations of each library were then combined 
in sets of three or four along with SeqCap Epi universal and indexing
oligos and bisulfite capture enhancer (SeqCap Epi Accessory Kit).
Each pool was lyophilized using TOMY Micro-Vac (MV100),
resuspended in hybridization buffer (SeqCap Epi Hybridization and Wash
Kit), and then hybridized to SeqCap Epi Probe Pool (Roche) for 72 h
at 47 °C in a thermocycler. Following the 72 h incubation, captured
bisulfite-converted libraries were recovered (SeqCap Pure Capture
Bead Kit) at 47 °C in a thermocycler for 45 min, with intermediate
vortexing. Capture beads were washed (SeqCap Hybridization and
Wash Kit) in a 47 °C water bath. Captured bisulfite-converted libraries
were then amplified via PCR (SeqCap Epi Accessory Kit). Libraries
were sequenced with 10% PhiX spike-in as 100-bp end reads on the
HiSeq2500 in rapid run mode. 

Hybrid-selection bisulfite sequencing (HSBS) data were processed 
by methylCtools 0.9.4, using BWA mem version 0.7.12, and aligned to
human reference hg38. Owing to the sizes of the captured fragments,
probe capture resulted in an effective coverage of about 600 bp around
CTCF sites. PCR duplicates were removed by Picard toolkit 2.9.2. DNA
methylation levels were called by methylCtools 0.9.4 in loci covered
by at least five reads. Methylation at 36,281 CTCF-binding sites that are
bound in GIST tumours and covered by the assay was used for
downstream analysis.
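
To make the coverage rule concrete, the following Python sketch computes a per-CpG methylation fraction only where at least five reads cover the site, then averages the passing CpGs in a window around a site centre. It is a simplified illustration and not methylCtools itself; the input tuple layout, function names and toy counts are assumptions.

```python
MIN_COVERAGE = 5  # minimum reads per CpG, as stated in the text


def cpg_methylation(methylated, unmethylated, min_coverage=MIN_COVERAGE):
    """Return the fraction methylated at one CpG, or None if coverage is too low."""
    coverage = methylated + unmethylated
    if coverage < min_coverage:
        return None
    return methylated / coverage


def window_methylation(cpgs, centre, half_width):
    """Average methylation of adequately covered CpGs within centre +/- half_width.

    cpgs: list of (position, methylated_reads, unmethylated_reads) tuples
    (a hypothetical input layout chosen for this sketch).
    """
    levels = []
    for pos, meth, unmeth in cpgs:
        if abs(pos - centre) <= half_width:
            level = cpg_methylation(meth, unmeth)
            if level is not None:
                levels.append(level)
    return sum(levels) / len(levels) if levels else None


# Toy CpGs (made-up counts): the second has only 2 reads and is skipped,
# the fourth lies outside the window.
toy_cpgs = [(1000, 8, 2), (1050, 1, 1), (1100, 5, 5), (2000, 10, 0)]
print(window_methylation(toy_cpgs, centre=1050, half_width=125))
```

A half-width of 125 bp corresponds to the 250 bp window used elsewhere in the Methods for averaging methylation around a peak centre.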


HiC and HiChIP 

In situ HiC was performed as described. CTCF HiChIP was performed as
described. In brief, 3 tubes of ~5 million GIST-T1 cells were crosslinked
in 1% formaldehyde (2 replicates for HiChIP, 1 for HiC). For HiChIP, cells
were lysed using HiC lysis buffer as described. Chromatin was digested
with 375 U MboI restriction enzyme (NEB, R0147). After heat
inactivation of restriction enzyme and marking of ends with biotin-14-dATP
(Life Technologies, 19524-016), DNA was ligated using T4 buffer (NEB,
B0202). Chromatin was sheared by Covaris LE220. Chromatin
immunoprecipitation was performed with 30 µl of monoclonal rabbit CTCF
antibody, clone D31H2 (Cell Signaling 3418). Protein was bound by
Protein G beads and after washing was incubated in DNA elution buffer.
Eluant was treated with proteinase, crosslinks were reversed and the
sample was Zymo purified (Zymo DNA Clean and Concentrator D4003).
Biotinylated DNA was pulled down with M280 Streptavidin beads
(Invitrogen 11205D) and the DNA was fragmented with Tn5 and libraries
were constructed with Nextera kit (Illumina). For HiC, cells were lysed
in HiC lysis buffer and chromatin was digested with 200 U MboI (NEB,
R0147) overnight. Ends were marked and DNA was ligated as in HiChIP.
DNA was precipitated and sheared by Covaris E220. Biotinylated DNA
was pulled down by T1 Streptavidin beads (Life Technologies, 65602).
End-repair, A-tailing and adaptor ligation were performed as described.
Libraries were prepared using Phusion High-Fidelity DNA Polymerase
(NEB, M0530). HiChIP and HiC libraries were sequenced as 75-bp end
reads on an Illumina NextSeq500 instrument. Data were processed
using HiC-Pro and visualized by the WashU EpiGenome Browser and
the R package Sushi.

The two HiChIP replicates showed high similarity, and therefore 
were merged for the rest of the analysis. CTCF-CTCF loops were called 
from HiChIP data with hichipper 0.7.3. Only loops with FDR <5%
and supported by at least 5 reads were considered for downstream 
analysis. 
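
The loop filter stated above is a simple two-condition threshold. A minimal Python sketch follows, assuming each loop is a record with an FDR and a supporting read count; the field names `fdr` and `reads` and the toy records are hypothetical and do not reflect hichipper's actual output columns.

```python
def filter_loops(loops, max_fdr=0.05, min_reads=5):
    """Keep loops with FDR below max_fdr and at least min_reads supporting reads."""
    return [loop for loop in loops
            if loop["fdr"] < max_fdr and loop["reads"] >= min_reads]


# Toy loop calls (made-up values).
toy_loops = [
    {"id": "L1", "fdr": 0.01, "reads": 12},   # passes both criteria
    {"id": "L2", "fdr": 0.20, "reads": 30},   # fails FDR
    {"id": "L3", "fdr": 0.03, "reads": 4},    # fails read support
    {"id": "L4", "fdr": 0.049, "reads": 5},   # passes both criteria
]
kept = filter_loops(toy_loops)
print([loop["id"] for loop in kept])
```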


4C analysis 

4C analysis was performed using methods adapted from published
protocols. In brief, ~10 million cells from culture or frozen minced
tumour specimens were crosslinked in 2% formaldehyde. Fixed
samples were lysed in lysis buffer containing protease inhibitor cocktail
and mechanically disrupted using a Biomasher tissue grinder (Kimble
Chase). Lysis was confirmed using methyl green-pyronin staining.
Following lysis, nuclei were digested with NlaIII (NEB) overnight at 37 °C
in a thermomixer set to 950 rpm. After heat inactivation of restriction
enzyme, diluted nuclei were ligated using T4 DNA ligase (NEB) overnight
at 16 °C, followed by RNase and proteinase K treatment. Isolated DNA was
then digested overnight in Csp6I (Thermo) at 37 °C, diluted and ligated
overnight at 16 °C in order to circularize fragments. Efficacy of each
ligation and digestion step was verified via agarose gel electrophoresis.
Purified circularized DNA was used as input in PCR reactions to create
sequencing libraries. Sixteen reactions per sample were performed,
each using 200 ng of circularized 4C DNA (3.2 µg total) in 50 µl using
Q5 high-fidelity PCR mastermix (NEB). Primers contained sequencing
adapters and barcodes, and annealing sections were as follows: KIT
enhancer viewpoint primer: 5’-TTTCTATTTGCTCGTTCATG-3’; KIT
non-viewpoint primer: 5’-GGAAACTTCCAAAGTAGGCT-3’; ANO1 enhancer
viewpoint primer: 5’-ATGTCGCCCTCCTGCATG-3’; ANO1 non-viewpoint
primer: 5’-AGACAAATGAGGCCTGGACG-3’; ANO1 viewpoint primer for
SNP detection: 5’-CTCAAACAGACACTCACATG-3’; ANO1 non-viewpoint
primer for SNP detection: 5’-TCTTTTTGGTTGGATTGTAGGAGT-3’.
Standard 4C sequencing libraries were sequenced as 38-bp end reads
on an Illumina NextSeq500, although only the first read (the viewpoint
primer read) was used for further processing. 4C sequencing libraries
for detecting SNPs were read as 75-bp end reads on the same machine,
and the second read was used for SNP detection. Data were analysed via
4Cseqpipe, and median normalized data with a main trend resolution
of 22.5 kb were visualized in R.


RNA-seq 

Total RNA was isolated from clinical GIST samples using the RNeasy 
Plus Kit (Qiagen) and quality was determined via Bioanalyzer (Agilent). 
Libraries were prepared using the TruSeq Stranded mRNA Library Prep 
Kit (Illumina), and equimolar multiplexed libraries were sequenced with 
single-end 75 bp reads on an Illumina NextSeq 500.

Reads were aligned using STAR 2.5.3 to the human reference (hg19).
RNA-seq data for SDH-intact GISTs were previously published. Gene
expression was estimated by featureCounts 1.6.2. TPM values were
calculated for these data sets. RNA-seq TPM values for normal stomach
were downloaded from GTEx v7.
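
For readers unfamiliar with the unit, TPM (transcripts per million) can be derived from per-gene read counts and gene lengths as sketched below in Python. This is the standard textbook definition and is offered as an assumption about the calculation, since the text does not spell out the formula; the gene names, counts and lengths are made up.

```python
def tpm(counts, lengths_bp):
    """Convert per-gene read counts to transcripts per million.

    counts: dict gene -> read count; lengths_bp: dict gene -> gene length in bp.
    """
    # Reads per kilobase corrects each gene for its length.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero")
    # Rescale so that values sum to one million within the sample.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Toy input: geneB has three times the reads but is three times longer,
# so both genes end up with the same TPM.
toy_counts = {"geneA": 100, "geneB": 300}
toy_lengths = {"geneA": 1000, "geneB": 3000}
values = tpm(toy_counts, toy_lengths)
print(values)
```

Because TPM values sum to the same total in every sample, they are convenient for thresholds such as the >1 TPM and >100 TPM cut-offs used in the statistical analysis.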


Statistical analysis and reproducibility 

CTCF peaks of all GISTs were merged by bedtools merge 2.26,
including only peaks with a score >50 and in the top 50,000 as reported by
HOMER. Peaks were then centred around CTCF motif where found by
FIMO (MEME 4.7) at a 100 bp window around the peak centre, based on
JASPAR 2014 CTCF motif MA0139.1 bases 4-19. If multiple motifs were
detected, we kept the one with the highest score. Reads were counted
by HTSeq 0.6.1. CTCF profiles were normalized by copy number
estimates and across samples by standard median ratio. Copy number
values were estimated from input by CNVnator 0.3, with 5 kb bins. CTCF
sites bound in all samples were used for median ratio normalization as
implemented by DESeq2, where a site is considered bound in any given
sample if its signal is at least 0.25 of the median of the top 10,000 sites
for that sample. Normalization factors were used to scale CTCF signal
for visualization (Figs. 2, 4) and differential analysis. Differential CTCF
binding was called by DESeq2, identifying 2,106 CTCF-binding sites that
significantly lose CTCF binding in our cohort (FDR <5%, fold change >2).
To estimate methylation at lost insulator sites we measured average
methylation over a 250 bp window around the peak centre. We identified
4,502 sites that significantly gain methylation (FDR <5% as determined
by t-test, methylation increase >25%). Sites that both significantly lose
CTCF binding and gain DNA methylation were considered ‘perturbed’
for downstream analysis. We focused on CTCF-CTCF loops that overlap
with a perturbed CTCF site and either (1) the loop contains a promoter
that is insulated from a super-enhancer (<500 kb away) by the perturbed
CTCF site; or (2) the loop contains a super-enhancer (at least half of the
super-enhancer resides in the loop) that is insulated from a promoter
(<500 kb away) by the perturbed CTCF site. Here we only considered
super-enhancers that scored in at least two SDH-deficient GISTs. This
resulted in the identification of 60 putatively lost insulators with loss of
CTCF and >50% methylation gain, and 167 putatively deregulated genes
with >1 TPM median expression across SDH-deficient GIST, 25 of which
with >100 TPM (Fig. 1e and Supplementary Table 2).
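
The normalization described in this paragraph combines a per-sample "bound" threshold with median-of-ratios size factors. The Python sketch below illustrates that logic on a toy scale. It is a re-implementation of the general idea under stated assumptions, not DESeq2's code: function names and toy signals are hypothetical, and the copy-number correction is not shown.

```python
from math import exp, log
from statistics import median


def bound_mask(signal, top_n=10_000, fraction=0.25):
    """Flag sites whose signal is at least `fraction` of the median of the top_n sites."""
    top = sorted(signal, reverse=True)[:top_n]
    threshold = fraction * median(top)
    return [s >= threshold for s in signal]


def size_factors(samples, top_n=10_000, fraction=0.25):
    """Median-of-ratios size factors computed over sites bound in every sample.

    samples: list of per-sample signal lists, all indexed by the same sites.
    """
    masks = [bound_mask(s, top_n, fraction) for s in samples]
    shared = [i for i in range(len(samples[0]))
              if all(m[i] for m in masks) and all(s[i] > 0 for s in samples)]
    if not shared:
        raise ValueError("no site is bound in every sample")
    # Geometric mean of each shared site across samples serves as the reference.
    geo = {i: exp(sum(log(s[i]) for s in samples) / len(samples)) for i in shared}
    return [median(s[i] / geo[i] for i in shared) for s in samples]


# Toy data (made-up): sample_b has twice the depth of sample_a at shared sites;
# the last site is unbound in sample_a and is ignored for normalization.
sample_a = [100.0, 80.0, 60.0, 2.0]
sample_b = [200.0, 160.0, 120.0, 90.0]
factors = size_factors([sample_a, sample_b], top_n=3)
print([round(f, 3) for f in factors])
```

Dividing each sample's signal by its size factor places the samples on a common scale before visualization or differential testing.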

To test correlations between methylation and CTCF binding we 
focused only on peaks detected in at least one GIST sample and with 
an annotated CTCF motif (see above). To empirically estimate the null 
distribution of the correlation coefficient, CTCF-binding sites were 
permuted 100 times (Extended Data Fig. 1c).
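
A permutation null of this kind can be sketched as follows in Python: shuffle the pairing between CTCF signal and methylation, recompute the correlation each time, and compare the observed value with the resulting distribution. Spearman correlation is used here because Extended Data Fig. 1c reports Spearman correlations; the function names, the fixed seed and the toy values are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean


def ranks(values):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def permutation_null(ctcf_signal, methylation, n_perm=100, seed=0):
    """Correlations obtained after randomly re-pairing sites n_perm times."""
    rng = random.Random(seed)
    shuffled = list(methylation)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(spearman(ctcf_signal, shuffled))
    return null


# Toy data (made-up): CTCF signal falls as methylation rises.
ctcf = [9.0, 7.0, 6.0, 4.0, 2.0, 1.0]
meth = [0.10, 0.20, 0.35, 0.50, 0.80, 0.90]
observed = spearman(ctcf, meth)
null = permutation_null(ctcf, meth, n_perm=100)
print(observed, min(null), max(null))
```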

To derive GIST MAPK activity score, we used previously identified
MAPK biomarkers, and published data of Imatinib treatment of a GIST
line. We picked biomarkers that were downregulated by the Imatinib
treatment (P < 0.01 by t-test, fold change >2), and expressed in all primary
GISTs (>5 TPM). This yielded four biomarkers: DUSP6, ETV5, SPRY2 and
SPRY4. Final MAPK score was computed by summing z-scores of the
four genes and dividing by 2, as suggested.
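
The score described above is a sum of four per-gene z-scores divided by 2, which equals the square root of the number of genes. A minimal Python sketch is given below. The use of the sample standard deviation and the scale of the input values (for example log-transformed TPM) are assumptions, as the text does not specify them, and the toy expression values are made up.

```python
from statistics import mean, stdev

MAPK_GENES = ("DUSP6", "ETV5", "SPRY2", "SPRY4")  # biomarkers named in the text


def mapk_scores(expression):
    """Per-sample MAPK activity score.

    expression: dict gene -> list of per-sample expression values,
    with samples in the same order for every gene.
    """
    n_samples = len(expression[MAPK_GENES[0]])
    totals = [0.0] * n_samples
    for gene in MAPK_GENES:
        values = expression[gene]
        mu, sd = mean(values), stdev(values)  # z-score across samples (assumed)
        for i, value in enumerate(values):
            totals[i] += (value - mu) / sd
    # Dividing by 2 = sqrt(4) keeps the summed score on a z-like scale.
    return [total / 2 for total in totals]


# Toy expression for three samples (made-up values).
toy_expression = {
    "DUSP6": [1.0, 2.0, 3.0],
    "ETV5": [10.0, 20.0, 30.0],
    "SPRY2": [2.0, 4.0, 6.0],
    "SPRY4": [5.0, 6.0, 7.0],
}
scores = mapk_scores(toy_expression)
print(scores)
```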

For Figs. 2a and 3a, representative ChIP-seq traces were selected 
from ChIP-seq profiles for the 11 KIT mutant and 6 SDH-deficient GISTs
characterized, all of which displayed similar results (see Supplementary 
Table 1 and Extended Data Fig. 2). 

For all CRISPR insulator deletions, viral introduction of CRISPR- 
Cas9 + sgRNA vector was repeated three times with separate viral prepa- 
rations and infections to generate biologically independent replicates. 

For epigenomic and transcriptomic characterization of clinical tissue 
(for example, ChIP-seq, 4C, RNA-seq), multiple clinical specimens were 
analysed, but technical replicates could not be performed on individual 
samples due to limited availability of material. 

For analysis of clinical tissue, no statistical methods were used to 
predetermine sample size; rather, all available SDH-deficient tumour 
specimens with validated SDH loss and enough material available for 
analysis were tested. For mouse studies, no specific statistical calcula- 
tions were performed; rather, sample size was determined based on 
prior experience with similar PDX trials. The investigators were not 
blinded to allocation during experiments and outcome assessment. 


Immunohistochemistry 

Immunohistochemistry was performed on 4-um-thick paraffin-embed- 
ded tissue sections using a mouse anti-SDHB monoclonal antibody 
(clone 21A11AE7; 1:200 dilution; Abcam), a rabbit anti-PDGFRA
monoclonal antibody (clone D13C6; 1:300 dilution; Cell Signaling Technology),
and a rabbit anti-KIT polyclonal antibody (1:150 dilution; Dako). Pressure
cooker antigen retrieval in citrate buffer (pH 6.1; Dako Target Retrieval 
Solution) was used for PDGFRA and SDHB. Dako Envision+ secondary 
antibody was used. The sections were developed using 3,3’-diamin- 
obenzidine as substrate and counterstained with Mayer’s haematoxylin. 


CRISPR-Cas9 insulator disruption 

The following CRISPR sgRNAs were cloned into the LentiCRISPR
vector: sgGFP 5’-GAGCTGGACGGCGACGTAAA-3’; sgKIT_CTCFpeak1
5’-GTCTCTCTTCTGCCAGCAGG-3’; sgKIT_CTCFpeak2
5’-GACTTCCCTGACACTAGATG-3’; sgFGF_CTCFpeak1
5’-GTCCCACTGCCACCACAAGA-3’; sgFGF_CTCFpeak2
5’-GGGCCAGGCCCGCCGCCAGG-3’; sgSDHB
5’-GTGTCTCTTTCAGGCATCTG-3’. sgRNAs were designed to either
the GG PAM in the consensus CTCF motifs for CTCF disruption, or to a
PAM near the’ splice junction of exon 4 of the SDHB gene. GIST-T1 cells
were infected with the relevant lentivirus(es) for 48 h. Cells were then
selected in 2 µg ml⁻¹ puromycin for 4 days, with puromycin-containing
media refreshed every 2 days. Cells were allowed to recover from
puromycin for 1 week before analysis. Genomic DNA was then isolated and the
region of interest was amplified using primers with sequencing adaptors
and the following annealing regions: KIT CTCF Peak1 Forward
5’-TTTGGGATTCGAGTGACCAC-3’; KIT CTCF Peak1 Reverse
5’-TTCAGGGCTCAACAGCTTCA-3’; KIT CTCF Peak2 Forward
5’-GGAAATAACCTCAACCGGTG-3’; KIT CTCF Peak2 Reverse
5’-GACTCGGTCTTGCTCCTCTAA-3’. Libraries
were sequenced on an Illumina NextSeq500 as 38 bp end reads, and
analysed for editing efficiency. Crosslinked cells were also harvested
for ChIP analysis to verify loss of CTCF binding.


Quantitative real-time polymerase chain reaction 

Total RNA was isolated from GIST-T1 cells using the RNeasy minikit
(Qiagen) and used to synthesize cDNA with the SuperScript III system
(Invitrogen). cDNA was analysed using the SYBR mastermix (Applied
Biosystems) on a 7500 Fast Real Time system (Applied Biosystems). Gene
expression primers were as follows: KIT forward
5’-GCACAATGGCACGGTTGAAT-3’; KIT reverse 5’-GGTGTGGGGATGGATTTGCT-3’;
KITLG forward 5’-AGCGCTGCCTTTCCTTATGA-3’; KITLG reverse
5’-CCGGGGACATATTTGAGGGT-3’; EPAS1 forward 5’-CCACCAGCTTCACTCTCTCC-3’;
EPAS1 reverse 5’-TCAGAAAAAGGCCACTGCTT-3’; FGF4 set 1 forward
5’-CCAACAACTACAACGCCTACGA-3’; FGF4 set 1 reverse
5’-CCCTTCTTGGTCTTCCCATTCT-3’; FGF4 set 2 forward
5’-GCAGCAAGGGCAAGCTCTAT-3’; FGF4 set 2 reverse
5’-CGGTTCCCCTTCTTGGTCTT-3’; FGF3 forward
5’-ATGCTTCGGAGCACTACAGC-3’; FGF3 reverse
5’-CACGTACCACAGTCTCTCGG-3’. All gene expression results were normalized to primers
for ribosomal protein, large, P0 (RPLP0) as follows: forward
5’-TCCCACTTGCTGAAAAGGTCA-3’; reverse 5’-CCGACTCTTCCTTGGCTTCA-3’.
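
Normalization to a reference gene such as RPLP0 is commonly done with the comparative Ct (2^-ΔΔCt) calculation. The text does not state which formula was used, so the Python sketch below is an assumption that illustrates the common approach rather than the authors' exact method; it also assumes ideal amplification efficiency (a doubling per cycle), and the Ct values are made up.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene versus a control condition (2^-ddCt).

    Each argument is a threshold-cycle (Ct) value; the reference gene
    would be RPLP0 in the setting described in the text.
    """
    delta_sample = ct_target - ct_reference          # dCt in the test condition
    delta_control = ct_target_ctrl - ct_reference_ctrl  # dCt in the control condition
    return 2 ** -(delta_sample - delta_control)


# Toy Ct values (made-up): the target amplifies one cycle earlier in the
# test condition while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 23.0, 18.0)
print(fold)
```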


Tyrosine kinometree visualization 

Tyrosine kinase phylogeny data were downloaded from kinase.com.
Phylogenetic tree was visualized using the R package ggtree.
sion data of each tyrosine kinase were averaged across the SDH-deficient 
GISTs, and then plotted on the tree, with the area of the red circles cor- 
responding to the average TPM value in SDH-deficient GISTs. 
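
Because circle area, not radius, encodes expression here, the radius must scale with the square root of the average TPM. The short Python sketch below shows that conversion. It is illustrative only: the original visualization was made in R with ggtree, the scaling constant `area_per_tpm` is a hypothetical parameter, and the TPM values are made up.

```python
from math import pi, sqrt
from statistics import mean


def circle_radius(avg_tpm, area_per_tpm=1.0):
    """Radius of a circle whose area is proportional to the average TPM."""
    return sqrt(area_per_tpm * avg_tpm / pi)


# Toy per-tumour TPM values for two kinases (made-up numbers).
toy_tpm = {"KIT": [300.0, 500.0], "FGFR1": [80.0, 120.0]}
radii = {kinase: circle_radius(mean(values)) for kinase, values in toy_tpm.items()}
print(radii)
```

With this scaling, a kinase with four times the average expression is drawn with twice the radius, so visual area stays proportional to TPM.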


Interrogation of public normal tissue, GIST and ICC expression 
data 

Data for normal tissue expression were obtained from ENCODE.
Mouse interstitial cell of Cajal expression data were previously
processed and published. Raw Affymetrix microarray data (CEL files)
of human ICC and GIST samples were downloaded from GEO, under
accessions GSE56670, GSE77839, GSE17743 and GSE20708. CEL
files were imported, normalized, and RMA values exported using the
R/Bioconductor package affy.


Flow cytometry enrichment and analysis of ICCs

Fresh benign stomach muscle tissue was obtained from the MGH 
Pathology Tissue Bank and dissected from the gastric epithelium.
Tissue was initially manually mechanically dissociated with a sterile
scalpel, and then subjected to fine mechanical dissociation through three
cycles of 1 min each in a Miltenyi gentleMACS dissociator, resulting in
a single-cell suspension. A small portion was removed from the cell 
suspension to serve as the unlabelled and unpermeabilized control 
to set size gates and test viability. The remainder of the cell suspen- 
sion was then incubated in a permeabilizing flow cytometry buffer and 
stained with ANO1-Alexa488 (Santa Cruz Biotechnology clone C-5),
KIT-PE (Biolegend Clone 104D2) or CD45-APC (BD Biosciences clone 
2D1) for 30 min at 4 °C. Non-permeabilized control cells were treated 
with propidium iodide immediately before analysis. Stained cells were 
analysed and collected on a Sony SH800S cell sorter. Compensation 
parameters were determined using single-labelled UltraComp eBeads 
(ThermoFisher). Approximately 1.5 million cells were sorted, of which
2,000 were collected as CD45⁻KIT⁺ANO1⁺ (ICC enriched). Cells were lysed
in a small volume of TAE/DTT and treated with Proteinase K. Genomic
DNA in the lysed cell mixture was then bisulfite converted using the EZ 
DNA Methylation Gold Kit (Zymo), subjected to locus PCR, and then 
sequenced on an Illumina NextSeq500. 


PDX model generation and efficacy studies 
The PDX model was generated from surgical resection tissue from an 
SDH-deficient GIST patient who consented to research use of material 
under an IRB-approved protocol. The surgical sample was implanted 
subcutaneously in female NSG mice and allowed to grow. Tumour 
growth was monitored by caliper measurements. Once tumours grew
to a size of 1,000 mm³, tumours were isolated and cut into pieces of
approximately 3 × 3 × 3 mm, dipped in Matrigel (Corning Life Science)
and transplanted subcutaneously in additional NSG mice. Tumours 
were passaged for no more than 10 times. Tumour samples from all 
passages were banked by viably freezing in Bambanker freezing media 
(Fisher Scientific) and used for further studies. For efficacy studies, 
tumour fragments were implanted into 8-week-old NSG mice. Tumours 
were allowed to establish to 192 ± 35.7 mm³ in size before
randomization into various treatment groups with n = 8/group as: vehicle control
(0.1 M citrate buffer, pH 4.5), 40 mg kg⁻¹ sunitinib (LC Laboratories, 0.1 M
citrate buffer, pH 4.5), 20 mg kg⁻¹ BGJ-398 (LC Laboratories, acetate
buffer, pH 4.6 and PEG300 in 1:1 ratio) or the combination of BGJ-398
and sunitinib. Mice were treated orally once daily for 28 days with these
agents. In the BGJ-398 treatment group, 4 of 8 mice, and in the
combination treatment group, 7 of 8 mice, lost >15% body weight requiring drug
holidays (1-3 days of drug holidays in the single agent BGJ-398 group
and 1-15 days of drug holidays in the combination group). Mice were
re-started on treatment once body weight recovered to >85% of
initial body weight. Treatment groups were censored when the tumour
volume reached the maximum permissible size of 2,000 mm³ in any
single mouse in that group. Statistics were determined by two-way ANOVA.
All relevant ethical regulations regarding research in animal models 
were followed. All animal experiments and study protocols were 
approved by the Dana Farber Cancer Institute Institutional Animal Care 
and Usage Committee (IACUC). The endpoint criteria for mice were 
a total tumour burden/size reaching 2 cm in any direction, a tumour 
volume exceeding 2,000 mm³, and/or a tumour mass that interferes with 
basic/vital bodily functions or becomes persistently ulcerated. 
These criteria were followed for all mice in the study. 
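
The dosing and censoring rules described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the function and constant names are hypothetical, and only the three thresholds (2,000 mm³, >15% weight loss, 85% of initial weight) come from the text.

```python
from typing import Sequence

MAX_TUMOUR_VOLUME_MM3 = 2000.0  # maximum permissible tumour size (from the text)
RESTART_FRACTION = 0.85         # >15% loss pauses dosing; dosing resumes at 85% of initial weight


def group_is_censored(volumes_mm3: Sequence[float]) -> bool:
    """A treatment group is censored once any single mouse reaches the size limit."""
    return any(v >= MAX_TUMOUR_VOLUME_MM3 for v in volumes_mm3)


def needs_drug_holiday(initial_weight_g: float, current_weight_g: float) -> bool:
    """True while body weight is below 85% of the initial weight (hypothetical helper)."""
    return current_weight_g / initial_weight_g < RESTART_FRACTION
```

For example, a 20 g mouse that drops to 16 g (80% of initial weight) would be on a drug holiday, and would resume dosing once it is back above 17 g.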


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


Sequencing data that support the findings of this study have been depos- 
ited in GEO with the accession code GSE107447. 
Extended Data Fig. 1 | Epigenomic characterization of GIST. a, ChIP-seq 
profiles for H3K27ac were compared for GIST specimens and other 
gastrointestinal tract tumour specimens (GAC, gastric adenocarcinoma; CRC, 
colorectal cancer; PDAC, pancreatic ductal adenocarcinoma). Heat map depicts 
pairwise Pearson correlations between the top 10,000 most variable peaks 
(yellow indicates high correlation; blue indicates low correlation). The 
dendrogram (left) was derived by unweighted average distance linkage. 
Enhancer patterns are relatively consistent across GIST subtypes, compared to 
other tumour types. b, DNA methylation levels in the vicinity of CTCF sites were 
profiled genome-wide by hybrid-selection bisulfite sequencing. CTCF sites are 
binned according to the amount their methylation increased in SDH-deficient 
GISTs, relative to SDH-intact GISTs (methylation change computed over a 250 bp 
window centred on the motif). For each bin, bar graphs depict the percentage of 
sites that lose CTCF binding in SDH-deficient GISTs, per ChIP-seq. Separate 
plots are shown for CTCF sites for which motifs do or do not contain a CpG. 
Increased methylation over CTCF sites is associated with more frequent loss of 
CTCF binding, even when the CTCF motif lacks a CpG. c, Plot depicts correlation 
between CTCF occupancy and DNA methylation in SDH-deficient GISTs. Red 
points show Spearman correlations between CTCF ChIP-seq signal and 
methylation of CpGs at indicated positions relative to the centre of the CTCF 
motif. Red line reflects correlation to average methylation over 10 bp windows. 
Randomly permuted data (black) are shown for comparison. Anti-correlation 
between CTCF occupancy and methylation is evident over a ~250-bp binding 
footprint. d, Genomic views of a representative 10 Mb region on chromosome 
21 depict chromosome topology (HiC, red), CTCF binding (ChIP-seq, orange) 
and CTCF-CTCF loop interactions (HiChIP, black) for the SDH-intact GIST 
model, GIST-T1. TADs are visible as triangles of enhanced interaction in HiC data, 
flanked by boundaries that correspond to loop interactions in HiChIP data. 
Genes (blue) are also indicated. 
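
The comparison in panel a (pairwise Pearson correlations over the most variable peaks) can be illustrated with a small standard-library sketch. It assumes a hypothetical input layout, one row per peak and one column per specimen, and does not reproduce the authors' pipeline or the subsequent average-linkage clustering.

```python
import math
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(ss_x * ss_y)
    return cov / denom if denom else float("nan")


def sample_correlations(signal, n_top=10_000):
    """signal: list of peaks, each a list of per-specimen values (assumed layout).

    Keeps the n_top peaks with the highest variance across specimens and
    returns the specimen-by-specimen Pearson correlation matrix.
    """
    ranked = sorted(signal, key=pvariance, reverse=True)[:n_top]
    samples = list(zip(*ranked))  # transpose: one tuple of peak values per specimen
    return [[pearson(a, b) for b in samples] for a in samples]
```

Restricting to the most variable peaks keeps the correlation from being dominated by peaks shared by every specimen.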
Extended Data Fig. 2 | Super-enhancers and insulators in GIST. a, Traces 
depict H3K27ac ChIP-seq signal for normal stomach muscle (NSM) and GISTs 
of the indicated subtype over the FGF-ANO1 locus. b, Traces depict H3K27ac 
ChIP-seq signal for NSM and GISTs of the indicated subtype over the PDGFRA-KIT 
locus. Genes are indicated in blue, and super-enhancer locations are 
indicated by green bars. For a, b, traces are representative of 11 KIT-mutant and 6 
SDH-deficient tumours with similar results. c, Traces depict CTCF binding over 
the FGF insulator in normal stomach muscle (NSM) and GIST clinical specimens. 
d, Plot depicts CTCF ChIP-seq signal over the strongest CTCF peak in the FGF 
insulator in normal stomach muscle (NSM, n = 4), and KIT mutant (n = 11), PDGFRA 
mutant (n = 2) and SDH-deficient GISTs (n = 6). e, Traces depict CTCF binding 
over the KIT insulator in normal stomach muscle (NSM) and GIST clinical 
specimens. f, Plot depicts CTCF ChIP-seq signal over the strongest CTCF peak in 
the KIT insulator in normal stomach muscle (NSM, n = 4), and KIT mutant (n = 11), 
PDGFRA mutant (n = 2) and SDH-deficient GISTs (n = 6). For d and f, horizontal 
bars reflect mean values and P values indicate significance of CTCF loss in 
SDH-deficient GIST, as determined by the Wald test (via DESeq2). All n values 
represent the number of biologically independent clinical specimens. 
Extended Data Fig. 3 | FGF locus 4C-seq data and insulator deletion. a, Traces 
depict 4C-seq data at FGF locus, as in Fig. 2b, except graphed on the same axis to 
allow for direct comparison. b, c, Bar plots quantify 4C-seq interactions between 
the super-enhancer viewpoint and FGF4 (b) or FGF3 (c). Expression of these 
genes in the corresponding SDH-deficient GIST specimens is also shown. 
d, Traces depict CTCF ChIP-seq signal in GIST-T1 cells infected with CRISPR-Cas9 
and either a control sgRNA directed at GFP (black, top) or sgRNAs directed 
against the two indicated CTCF motifs in the FGF insulator (second row, red). 
Extended Data Fig. 4 | Allelic imbalance in FGF3 and FGF4 activation. 
a, Two heterozygous SNPs in FGF4 (both 3′ UTR) enabled us to evaluate allelic 
expression in three SDH-deficient GISTs (tumours S6, S1 and S4). Both alleles for 
each SNP were detected in DNA sequencing data for these tumours, but only one 
allele was detected in RNA-seq data of tumours S1 and S6, indicative of 
mono-allelic FGF4 expression. Both alleles are detected in tumour S4, indicating 
bi-allelic expression of FGF4. b, Heterozygous SNPs in FGF3 exons (both 
synonymous base substitutions) enabled us to evaluate allelic expression in the 
SDH-deficient GISTs (tumours S2 and S5). In both cases, DNA sequencing 
confirmed heterozygosity at the genome level (C/A and T/C, respectively), but 
RNA-seq data demonstrated mono-allelic FGF3 expression. c, Both alleles of 
heterozygous SNPs in ANO1 exons were found in the RNA-seq data derived from 
SDH-deficient GIST samples, confirming bi-allelic expression of ANO1. Similarly, 
both alleles of heterozygous SNPs were found in the histone H3K27ac ChIP-seq 
data, confirming the bi-allelic nature of the super-enhancer (not shown). d, One 
SDH-deficient GIST sample was heterozygous for a SNP (rs386829467) located 
about 50 bp from the CTCF motif of Peak 2 in the FGF insulator. Allele-agnostic 
methylation data confirmed 43% methylation of the CTCF peak in this tumour, 
while essentially no methylation was detected in the SDH-intact tumours (left). 
Separation of the two alleles using the heterozygous SNP revealed strong allelic 
bias in the SDH-deficient tumour: one allele was largely unmethylated (~3% 
methylation), while the other was highly methylated (~75% methylation), 
consistent with mono-allelic methylation of the CTCF site (right). e, Schematic 
depicts 4C-seq experimental protocol and primer design for detecting SNPs. 
DNA elements in close physical proximity are crosslinked and restricted with an 
enzyme that leaves nucleotide overhangs. These overhangs are then proximity 
ligated to crosslinked fragments. A second restriction enzyme (with different 
restriction sites) is then used to circularize the ligated fragments, allowing for 
inverse PCR. Here we selected restriction enzymes and designed a custom read 2 
primer to capture a heterozygous SNP within the super-enhancer. This second 
read is normally non-informative as contact frequencies are determined 
through the viewpoint primer (read 1), but in this case enabled us to detect the 
SNP and assign each ligated fragment to a specific allele. f, The left trace (grey) 
depicts standard 4C-seq data (allele agnostic), which demonstrates strong 
interaction between super-enhancer viewpoint and FGF4. However, the SNP 
covered in the non-viewpoint read enabled us to distinguish interactions 
involving the minor (top right) or major (bottom right) allele. This revealed that 
the major allele (purple) is responsible for ~97% of super-enhancer-FGF4 
interactions. 
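
As a rough consistency check on panel d, the allele-agnostic methylation level should be a read-weighted average of the two allele-specific levels. The sketch below is illustrative arithmetic only, using the approximate values quoted in the legend. The function names and the equal-coverage default are assumptions, not part of the published analysis.

```python
def pooled_methylation(meth_a: float, meth_b: float, frac_a: float = 0.5) -> float:
    """Methylation seen when reads from two alleles are pooled.

    frac_a is the fraction of reads that come from allele A (assumed 0.5
    if both alleles are covered equally).
    """
    return frac_a * meth_a + (1.0 - frac_a) * meth_b


def implied_read_fraction(pooled: float, meth_a: float, meth_b: float) -> float:
    """Fraction of reads from allele A that would yield the pooled value."""
    return (pooled - meth_b) / (meth_a - meth_b)


# Approximate values from the legend: ~75% and ~3% per allele, 43% pooled.
equal_coverage = pooled_methylation(0.75, 0.03)
methylated_share = implied_read_fraction(0.43, 0.75, 0.03)
```

With equal coverage the pooled value comes to about 39%, close to the reported 43%. Equivalently, the reported value would correspond to a little over half of the reads coming from the methylated allele.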
Extended Data Fig. 5 | KIT locus 4C-seq data and insulator deletion. a, Traces 
depict KIT locus 4C-seq data, as in Fig. 3b, except graphed on the same axis to 
allow for direct comparison. b-d, Bar plots quantify 4C-seq interactions (top, 
reproduced from Fig. 3b) between the super-enhancer viewpoint and positions 
within the super-enhancer TAD (b), sequences just beyond the KIT insulator (c), 
or the KIT gene itself (d). P values indicate significance of difference between 
SDH-intact and SDH-deficient, by two-sided t-test. e, Traces depict CTCF ChIP-seq 
signal in GIST-T1 cells infected with Cas9 and either a control sgRNA directed 
at GFP (black, top) or sgRNAs directed against the two bound CTCF motifs in the 
KIT insulator (second row, red). 
Extended Data Fig. 6 | Hypoxia marker induction in GIST-T1 cells. a, Bar plot 
depicts levels of the TET product, 5-hydroxymethyl-cytosine (5-hmC), measured 
by ELISA in GIST-T1 cells infected with CRISPR-Cas9 and either a short guide RNA 
targeting GFP (sgGFP) or SDHB. Cells were cultured in either control media or 
media supplemented with 20 µM of dimethylsuccinate (DMS), a membrane-permeable 
ester of succinate, for 3 days, as indicated. b, c, Plots show relative 
expression of pseudo-hypoxia-associated genes EPAS1 (also known as HIF2A) 
and IGF1R (b), and KITLG (also known as SCF) (c) in control GIST-T1 cells 
(black), SDH-deficient GIST-T1 cells generated by CRISPR-Cas9 knockout of 
SDHB and cultured with exogenous succinate (red), or GIST-T1 cells treated with 
the iron chelator DFX to simulate hypoxia (blue). Upregulation of KIT ligand due 
to pseudo-hypoxia or tumour hypoxia may supplement FGF ligands in 
promoting RTK signalling in SDH-deficient GIST. 
Extended Data Fig. 7 | FGF and KIT insulator methylation and expression in 
GIST subtypes and non-malignant cells. a, Bar plot depicts methylation of FGF 
insulator CTCF peak 2 (top) and KIT insulator CTCF peak 2 (bottom) in 34 tissues 
and primary cells available through ENCODE. Values are average methylation 
of CpGs nearest the CTCF motifs, determined by whole genome bisulfite 
sequencing (WGBS) (KIT insulator, 3 CpGs; FGF insulator, 6 CpGs). Methylation 
of these sites is also shown for SDH-intact and SDH-deficient GISTs (see 
Fig. 3e, f), and for flow-sorted CD45⁻ANO1⁺KIT⁺ (ICC enriched) cells from normal 
stomach muscle (NSM) tissue (n values represent biologically independent 
specimens). b, Left, table depicts FPKM (fragments per kilobase of transcript per 
million mapped reads) values of relevant genes in mouse ICCs isolated from 
jejunum or colon. Right, dot plot depicts expression of FGF4 in either flow 
sorted ICCs (green) or GISTs of the indicated subtype: KIT mutant in black, 
PDGFRA mutant in blue, SDH-deficient in red, and KIT-/PDGFRA wild type in 
purple. SDH status of the latter group is unknown, but SDH-deficient GIST 
represent a significant portion of KIT-/PDGFRA wild-type tumours. Data are 
drawn from the indicated GEO series and publications. 
Extended Data Fig. 8 | PDX trial of FGFR and KIT combination therapy in 
SDH-deficient GIST. a, b, Genomic views depict RNA expression (black), 
H3K27ac (green) and CTCF occupancy (orange) over the FGF (a) and KIT (b) loci 
for the S1 primary tumour and PDX. Genes (blue), super-enhancers (green bar 
and shade) and lost CTCF insulators (orange shade) are indicated. c, Plot depicts 
tumour volume during treatment and observation periods of experiment, as in 
Fig. 4g, except with time axis extended until final group reached censor point 
(one tumour in the group >2,000 mm³). Points represent mean tumour size, 
error bars represent s.e.m., and shading represents range of tumour sizes for 
n = 8 biologically independent xenograft-bearing mice per group. For statistics, 
see Fig. 4g. d, Kaplan-Meier plot depicts survival until clinical endpoint (tumour 
size >2,000 mm³) for the same PDX trial. Median and range survival are indicated 
for each group. P values reflect difference in survival between groups as 
calculated by logrank test. 
Reporting Summary 


Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency 
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist. 


Statistics 


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


x| The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


x| A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


x A description of all covariates tested 


x| A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) 
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 
7 For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable. 
x For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 
x For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 
x Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection No software was used specifically for data collection; see below for data analysis software. 


Data analysis Software usage and parameters are detailed in the Methods section of the manuscript. Briefly, sequencing reads were aligned with BWA v. 0.7.4; 
methylation data were analyzed with methylCtools v. 0.9.4; ChIP-seq peaks were called and analyzed via Homer v. 4.9, MEME v. 4.7, HTSeq v. 
0.6.152 and DESeq2 v. 1.16.1; general data analysis and graphing were performed in Matlab v. 9.1.0.441655, R v. 3.5.3, and IGV v. 2.5.3. 


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


Sequencing data that support the findings of this study have been deposited in GEO with the accession code GSE107447. 


Raw sequencing data generated through this project may contain identifiable human genetic information, as such it requires IRB approval to access (data has been 
deposited into a dbGaP dataset connected to the GEO database). 


Raw data for the mouse xenograft trial (i.e. measured tumor volumes) are available as supplementary information table 3. 
Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


X | Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences 


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size For analysis of clinical tissue, no statistical methods were used to predetermine sample size; rather, all available SDH-deficient tumor 
specimens with validated SDH loss and enough material available for analysis were tested. For mouse studies, no specific statistical 
calculations were performed; rather, sample size was determined based on prior experience with similar PDX trials. 
Data exclusions No data were excluded from analyses. 


Replication For all CRISPR insulator deletions, viral introduction of CRISPR/Cas9+sgRNA vector was repeated three times with separate viral preparations 
and infections to generate biologically independent replicates. 
For epigenomic and transcriptomic characterization of clinical tissue (e.g. ChIP-seq, 4C, RNA-seq), multiple clinical specimens were analyzed, 
but technical replicates could not be performed on individual samples due to limited availability of material. All biological replicates (i.e. 
tumors of a given driver) were similar at reported sites (i.e. insulator loss/enhancer presence was consistent within driver subgroups). 


Randomization For PDX trial, xenograft-bearing mice were randomized to treatment group at ~200 mm³ tumor volume, with 8 mice per treatment group. 


Blinding For PDX trial, blinding was not possible due to preparation and delivery methods of tested drugs. This is thought to have minimal impact on 
the studies, as no subjective (e.g. behavioral) criteria were measured. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems Methods 
n/a | Involved in the study n/a | Involved in the study 
x | Antibodies x | ChIP-seq 
X| Eukaryotic cell lines x Flow cytometry 
x Palaeontology x MRI-based neuroimaging 


x| Animals and other organisms 


x | Human research participants 


x Clinical data 


Antibodies 


Antibodies used CTCF antibody is from Cell Signaling technologies, clone D31H2, catalog number 3418. 
Antibody was validated by manufacturer for ChIP in human cells, has been previously utilized 
by the authors (Flavahan et al., Nature 2016), and was validated as part of the ENCODE 
project via Western Blot for CTCF, as well as motif analysis to confirm enrichment for the 
known CTCF motif. 


H3K27ac is a rabbit polyclonal antibody available from Active Motif, catalog number 39133, 
lot 31814008. Antibody was validated by the manufacturer for ChIP in human cells, and was 
validated as part of the ENCODE project. 


Validation Both antibodies were validated by manufacturers (including statements on the websites), and by the investigators and 
colleagues as part of the ENCODE project - including, but not limited to, western blot, immunoprecipitation, and ChIP motif 
finding and analysis of control cell lines and known peaks/motifs. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) One cell line, GIST-T1, was used, and was obtained from the commercial vendor Cosmo 
Biosciences. 
Authentication The cell line was not authenticated via STR testing, however locus sequencing for the KIT 


gene confirmed the presence of the known and published KIT mutation present in GIST-T1 
cells and the parental tumor the cell line was derived from. 


Mycoplasma contamination The cell line was tested for mycoplasma via PCR-based method and confirmed to be 
mycoplasma-free. 


Commonly misidentified lines No commonly misidentified cell lines were utilized. 
(See ICLAC register) 




Animals and other organisms 


Policy information about studies involving animals; ARRIVE guidelines recommended for reporting animal research 


Laboratory animals For the PDX trial, female 8-week old NSG mice were utilized. 

Wild animals No wild animals were utilized in this study. 

Field-collected samples No field-collected samples were utilized in this study. 

Ethics oversight All animal experiments and study protocols were approved by the Dana Farber Cancer Institute Institutional Animal Care and 


Usage Committee (IACUC), and this is noted in the manuscript. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


Human research participants 


Policy information about studies involving human research participants 


Population characteristics The study involved the collection of deidentified and anonymized tumor material from patients at either Brigham and Women's 
Hospital, The Dana Farber Cancer Institute, or Massachusetts General Hospital. As such, no information about the patients, 
other than disease pathology, is known. 


Recruitment Patient tissue was obtained from tissue banks at either MGH or DFCI. Standard of care for GIST includes surgical resection of the 
tumor bulk - following treatment, excess surgical material from consenting patients was deposited into these banks. As the 


collection of these tissues occurs as part of the normal disease treatment, it is unlikely there is significant self-selection bias. 


Ethics oversight The study protocol was approved by the Massachusetts General Hospital IRB and the Dana Farber Cancer Institute IRB. 


Note that full information on the approval of the study protocol must also be provided in the manuscript. 


ChIP-seq 


Data deposition 


x | Confirm that both raw and final processed data have been deposited in a public database such as GEO. 


x | Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks. 


Data access links GEO database accession GSE107447 

May remain private before publication. _ https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE107447 

Files in database submission See included file, ChIP-seq_2017-12-16052D_file_info.xlsx 

Genome browser session (e.g. UCSC) no longer applicable 

Methodology 

Replicates For epigenomic and transcriptomic characterization of clinical tissue (e.g. ChIP-seq, 4C, RNA-seq), multiple clinical specimens 


were analyzed, but technical replicates could not be performed on individual samples due to limited availability of material. 
All clinical samples within a driver subgroup (SDH-intact vs. SDH-deficient) were highly similar at tested locations (e.g. KIT/ 
FGF insulator loss and superenhancer presence). See extended data figure 2 for more info. 


Sequencing depth See included file, ChIP-seq_2017-12-16052D_file_info.xlsx, which includes sequencing depth for each experiment. 


Antibodies CTCF antibody is from Cell Signaling technologies, clone D31H2, catalog number 3418. 
Antibody was validated by manufacturer for ChIP in human cells, has been previously utilized 
by the authors (Flavahan et al., Nature 2016), and was validated as part of the ENCODE 
project via Western Blot for CTCF, as well as motif analysis to confirm enrichment for the 
known CTCF motif. 


H3K27ac is a rabbit polyclonal antibody available from Active Motif, catalog number 39133, 
lot 31814008. Antibody was validated by the manufacturer for ChIP in human cells, and was 
validated as part of the ENCODE project. 


Peak calling parameters Peaks were called with HOMER 4.9 against input controls. To call all H3K27ac peaks, we used ‘histone’ settings. To call super- 
enhancers, we used ‘super’ settings and no local filtering. CTCF peaks were called with ‘factor’ settings. 


Data quality All reported peaks are detected with FDR < 0.1% and fold change > 4 
Software Homer 4.9, DESeq2 1.16.1, FIMO/MEME 4.7, HTSeq 0.6.1, featureCounts 1.6.2, and CNVnator 0.3 
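
The data-quality thresholds stated above (FDR < 0.1% and fold change > 4) amount to a simple per-peak filter. The sketch below illustrates that filter on a hypothetical peak record. The `Peak` class and its field names are assumptions for illustration and do not correspond to HOMER's output format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Peak:
    """Hypothetical peak record; field names are illustrative only."""
    chrom: str
    start: int
    end: int
    fold_change: float  # enrichment over input control
    fdr: float          # false discovery rate as a fraction (0.001 == 0.1%)


def passes_quality(peak: Peak, max_fdr: float = 0.001, min_fold: float = 4.0) -> bool:
    """Apply the reported thresholds: FDR < 0.1% and fold change > 4."""
    return peak.fdr < max_fdr and peak.fold_change > min_fold


def filter_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    """Keep only the peaks that meet both thresholds."""
    return [p for p in peaks if passes_quality(p)]
```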
The Fanconi anaemia (FA) pathway repairs DNA damage caused by endogenous and 
chemotherapy-induced DNA crosslinks, and responds to replication stress. Genetic 
inactivation of this pathway by mutation of genes encoding FA complementation 
group (FANC) proteins impairs development, prevents blood production and 
promotes cancer. The key molecular step in the FA pathway is the 
monoubiquitination of a pseudosymmetric heterodimer of FANCD2-FANCI by the 
FA core complex, a megadalton multiprotein E3 ubiquitin ligase. Monoubiquitinated 
FANCD2 then recruits additional protein factors to remove the DNA crosslink or to 
stabilize the stalled replication fork. A molecular structure of the FA core complex 
would explain how it acts to maintain genome stability. Here we reconstituted an 
active, recombinant FA core complex, and used cryo-electron microscopy and mass 
spectrometry to determine its structure. The FA core complex comprises two central 
dimers of the FANCB and FA-associated protein of 100 kDa (FAAP100) subunits, flanked 
by two copies of the RING finger subunit, FANCL. These two heterotrimers act as a 
scaffold to assemble the remaining five subunits, resulting in an extended asymmetric 
structure. Destabilization of the scaffold would disrupt the entire complex, resulting in 
a non-functional FA pathway. Thus, the structure provides a mechanistic basis for the 
low numbers of patients with mutations in FANCB, FANCL and FAAP100. Despite a lack 
of sequence homology, FANCB and FAAP100 adopt similar structures. The two FANCL 
subunits are in different conformations at opposite ends of the complex, suggesting 
that each FANCL has a distinct role. This structural and functional asymmetry of 
dimeric RING finger domains may be a general feature of E3 ligases. The cryo-electron 
microscopy structure of the FA core complex provides a foundation for a detailed 
understanding of its E3 ubiquitin ligase activity and DNA interstrand crosslink repair. 


The FA core complex is composed of eight stably-associated subunits: 
FANCA, FANCB, FANCC, FANCE, FANCF, FANCG, FANCL and FAAP100. 
FANCL contains a RING-finger domain which acts as the E3 ubiquitin 
ligase. It associates with FANCB and FAAP100 to form a catalytic 
module; a low-resolution negative-stain electron microscopy (EM) 
study suggested that, in the absence of the other subunits, this is a 
symmetric dimer of FANCB-FANCL-FAAP100 heterotrimers. FANCA and 
FANCG are proposed to act as a chromatin-targeting module, whereas 
FANCC, FANCE and FANCF form a substrate-recognition module. 
Despite the central role of the FA core complex in DNA repair, we lack a 
molecular understanding of how FANCL incorporates into the complex 
to perform site-specific monoubiquitination of the FANCD2-FANCI 
substrate and how mutation disrupts the function of the complex. 
To determine the structure of the FA core complex, we overexpressed 
all eight subunits from Gallus gallus (chicken) on a single baculovirus 
in insect cells, which enabled us to purify an intact, recombinant complex 
(Fig. 1a). The purified complex specifically monoubiquitinated 
FANCD2 but not FANCI in vitro (Extended Data Fig. 1), similar to the 
native chicken complex. 

To investigate the molecular basis of subunit association, we 
imaged this recombinant FA core complex using cryo-EM. This 
revealed an elongated structure, about 25 nm in length (Fig. 1b, 
Extended Data Fig. 2a). We determined a 3D reconstruction of the 
FA core complex at an overall resolution of 4.2 Å (Extended Data 
Fig. 2b-e, Extended Data Table 1). The peripheral regions were less 
well resolved than the central core, possibly owing to conformational 
flexibility. In agreement with conformational heterogeneity 
in the complex, we detected structural variations using multi-body 
refinement, including a continuum of conformational movement 
of the top and base regions (Extended Data Fig. 2f, Supplementary 
Videos 1, 2). Particle subtraction followed by focused classification 
and refinement generated separate, improved reconstructions 
of the top and base regions (Fig. 1c, Extended Data Fig. 2g, h and 
Supplementary Video 3). 

Fig. 1 | Overall structure of the FA core complex. a, SDS-PAGE analysis of 
purified FA core complex with subunits and molecular weight markers 
indicated. FANCC carries a 2x Strep II tag on its C terminus (FANCC-SII). This 
purification was repeated more than three times with similar results. For gel 
source data, see Supplementary Fig. 1. b, Selected 2D reference-free class 
averages of the FA core complex. One class appears to be symmetric (labelled). 
Asterisks mark disordered density extending from the side of the complex that 
does not align well. c, Focused classification and refinement on the top and base 
regions, and multibody refinement on the middle region resulted in three 
independent cryo-EM maps that are shown separately, in three different shades 
of grey. d, Crosslinking mass spectrometry revealed 834 crosslinks (1% false 
discovery rate) between residues that are in close proximity. Intermolecular 
crosslinks are shown, coloured by interacting regions. e, Model of FA core 
complex (cartoon representation) fitted into the EM density (isosurface 
representation with transparency). Map and model are coloured by assigned 
subunits. The green star marks a channel with diameter approximately 23 Å. 

Our map of the complete FA core complex was of sufficient resolution 
to dock existing structures and to resolve secondary structure elements 
(Extended Data Fig. 2i). We fit two previously determined high-resolution 
structures (FANCL and part of FANCF) into the map, accounting for 
about 12% of the entire mass of the complex. Most subunits do not have 
substantial homology to proteins of known structure (Extended Data 
Table 2), so we modelled α-helices and β-strands into the remainder of 
the map. To determine which subunits these α-helices and β-strands 
belong to, we required additional data. 


Next, we purified subcomplexes and imaged them using cryo-EM. By 
comparing the 2D class averages of subcomplexes to the complete FA 
core complex, we identified regions corresponding to specific subunits 
(Extended Data Fig. 3a–c). Removal of FANCA did not substantially 
change the class averages. This suggested that FANCA may be 
conformationally heterogeneous and blurred out in reconstructions, or it may 
dissociate or denature during cryo-EM specimen preparation. The base 
was absent in a complex of FANCA, FANCG, FANCB, FANCL and FAAP100, 
suggesting that the base probably contains the substrate-recognition 
module (FANCC, FANCE and FANCF). The partially disordered arm that 
extends from the central part of the complex is probably FANCG because 
this density is lost when FANCG is removed from a FANCB-FANCL-FAAP100 
complex. Finally, 2D classes of the catalytic module (FANCB, 
FANCL and FAAP100) resemble the middle region of the FA core complex. 

We also studied the structure of the FA core complex using non- 
covalent native mass spectrometry (Extended Data Fig. 3d, e). During 
ionization, the FANCE subunit tends to dissociate and subcomplexes are 
formed, providing information on subunit stoichiometry and protein- 
protein interactions. The largest complex (808 kDa) that we detected 
in native mass spectrometry contains seven of the eight different subunits, 
including two copies of FANCB, FANCL and FAAP100, and a single 
copy of each of the remaining subunits. This agrees with the subunit 
stoichiometries of a purified native FA core complex. FANCB, FANCL 
and FAAP100 were present in most of the subcomplexes identified by 
native mass spectrometry (Extended Data Fig. 3d), suggesting that they 
form a central core. 

To identify residues in close proximity, we performed crosslinking 
mass spectrometry (Fig. 1d, Extended Data Fig. 4). This revealed 834 
inter- and intramolecular crosslinks, with 40% of these located in FANCB, 
FANCL and FAAP100. This is consistent with these three subunits forming 
an intimate complex. By combining the crosslinking mass spectrometry 
data showing which residues are in close proximity with the subunit 
assignment from subcomplexes, the subunit stoichiometry from native 
mass spectrometry, homology modelling and secondary structure predictions, 
we generated models for all FA core complex subunits except 
FANCA (Fig. 1e, Extended Data Fig. 5a–d, Extended Data Tables 1, 2, 
Supplementary Video 4, Methods). 

A dimer of FANCB-FAAP100 heterodimers is located in the middle 
region of the structure (Fig. 2). Two pairs of long α-helices (coiled coils) 
connect central β-strands and helical bundles with peripheral densities 
(Fig. 2b, c). Crosslinking mass spectrometry and modelling showed that 
each coiled coil is probably composed of α-helices from FANCB and 
FAAP100 (Extended Data Fig. 5e, f). At the peripheral ends of the coiled 
coils, we identified pairs of β-propellers, each containing a β-propeller 
from the N-terminal region of FANCB or FAAP100 (Fig. 2d). We could 
differentiate the two β-propellers on the basis of crosslinks: the FANCB 
β-propeller is near the coiled coil, whereas the FAAP100 β-propeller 
is close to the ELF domain of FANCL. Unexpectedly, despite the lack 
of sequence homology, FANCB and FAAP100 share markedly similar 
overall structures and domain organizations (Fig. 2e). 

A homology model of FANCL, including the ELF, URD and RING finger 
domains, fits into the base of the complex (Fig. 3a, b) but the relative 
orientations of the individual domains are different compared with 
the crystal structure (Extended Data Fig. 4b). By contrast, only the ELF 
domain could be placed into the second copy of FANCL at the top of the 
complex (Fig. 3c). Hydrogen-deuterium exchange mass spectrometry 
(HDX-MS) confirmed that FANCL interacts with the FANCB-FAAP100 
coiled coil (Extended Data Fig. 6). 

FANCG contains tetratricopeptide repeats (TPRs) (Fig. 3d, e), 
crosslinks to the central FANCB-FAAP100 dimer (Fig. 1d), and is required 
for FANCA association with the FA core complex (Extended Data Fig. 3b). 
Of note, there is a channel between FANCG and the catalytic module 
(Fig. 1e). FANCA, which is absent in the maps, is probably peripheral to 
FANCG, and possibly located in the blurred density visible in 2D class 
averages (indicated by asterisks in Fig. 1b; Extended Data Fig. 5d). 
The large number of crosslinks between FANCA and FANCG is consistent 
with their proximity (Fig. 1d). 

The substrate-recognition module (FANCC, FANCE and FANCF), 
located in the base, comprises an arc of α-helices (Fig. 3f). FANCF occupies 
a central position within the arc. Crosslinking mass spectrometry 
showed that the FANCL RING finger and ELF domain contact FANCE 
and FANCF, the FANCL URD domain contacts FANCC and FANCE, and 
the FANCC C-terminal region and FANCE are in close proximity. Native 
mass spectrometry also showed direct contact between FANCE and 
FANCF (Extended Data Fig. 3d). 

Fig. 2 | The molecular scaffold of the FA core complex includes a dimer of 
FANCB-FAAP100 heterodimers. a, Surface representation of the FA core 
complex model, highlighting FANCB and FAAP100. FANCB, orange; FAAP100, 
yellow; regions where we are unable to distinguish FANCB and FAAP100, 
yellow-orange. b-d, Models of FANCB and FAAP100 subunits in cartoon 
representation placed into the cryo-EM map. In b, a black oval marks the 
pseudo two-fold symmetry axis. There are substantial differences between 
the two symmetry-related copies, which are shown in different shades of 
yellow in the model. In d, cones indicate the orientations of the central 
pores of the β-propellers and the angles between them are shown. 
e, Proposed domain organization of FANCB and FAAP100, showing their 
structural similarity. 

We also determined a structure of a subcomplex, present at a lower 
abundance in our sample, at an overall resolution of 4.6 Å (Extended 
Data Fig. 7a-d). This subcomplex was symmetric but the map did not 
improve on application of C2 symmetry. We therefore implemented a 
local symmetry algorithm in Relion for averaging the two halves of the 
subcomplex map (Methods, Extended Data Fig. 7e, f). This symmetric 
structure revealed an assembly comprising two copies of each of FANCB, 
FANCL, FAAP100 and FANCG (Extended Data Fig. 8a–c, Supplementary 
Video 5). It is unclear whether this subcomplex has a functional 
role in vivo or whether it is an assembly intermediate. In both copies 


Fig. 3 | Asymmetric dimerization in the FA core complex. a-c, Models of 
FANCL in the FA core complex. a, Surface representation of the FA core 
complex model, highlighting the two copies of FANCL. b, c, Models of 
FANCL_base and FANCL_top subunits in cartoon representation are shown 
fitted in the cryo-EM map. Density for the URD and RING domains is not 
well defined in the top copy (c). d-f, Models of FANCG and the 
substrate-recognition module. d, Surface representation of the FA core 
complex model highlighting FANCG and the substrate-recognition module 
(FANCC-FANCE-FANCF). e, f, Models of FANCG TPR domain (e) and the 
substrate-recognition module (f) are shown fitted into the cryo-EM map. 
The crystal structure of FANCF could be assigned and the N and C termini 
of the model are indicated. The first helix (N to *) was not present in the 
crystal structure. Since all three subunits of the substrate-recognition 
module are substantially helical, it was not possible to assign the 
remaining helices of the base to individual subunits. g, Model for 
monoubiquitination of FANCD2 by the FA core complex. The major motions 
detected in multibody refinement are indicated with grey arrows. 


of FANCL, the URD and RING finger domains have weak density or are 
not visible, similar to the top FANCL (FANCL_top) in the full, asymmetric 
FA core complex. 

Comparison of subcomplex and FA core complex structures suggests 
that the substrate-recognition module alters the relative orientations 
of the B-propellers and coiled coil in the base (Fig. 2d, Supplementary 
Video 6). This disrupts the symmetry of the catalytic module in the com- 
plete FA core complex, provides a binding site in the base for the URD 
and RING finger domains of FANCL and probably disrupts docking of 
a second (symmetric) copy of FANCG onto the middle region owing to 
steric clashes (Extended Data Fig. 8d). These structural alterations may 
be transmitted to the top region to prevent the binding of a second sub- 
strate-recognition module, consistent with allosteric coupling proposed 
previously. In agreement with this, a purified substrate-recognition 
module did not readily associate with the symmetric subcomplex in vitro 
to form the asymmetric FA core complex (Extended Data Fig. 8e, f), 
suggesting that in vivo assembly is required. 

Dimerization is required for the activity of other E3 ligases, including 
the Rad18, RNF8 and CHIP homodimers and the BRCA1-BARD1 
and Ring1b-Bmi1 heterodimers. These contain two RING finger-U-box 
domains arranged in an asymmetric manner with only a single 
functional E2 binding site. Notably, the activity of multi-subunit 
cullin-RING finger E3 ligases is stimulated by dimerization. Thus, 
structural and functional asymmetry appear to be a common feature 
of E3 ligases but the spatial separation of the FANCL RING fingers in 
the FA core complex is unusual (Extended Data Fig. 9a, b). Like other 
dimeric RING finger E3s, one RING finger (in the FANCL subunit at the 
base (FANCL_base)) of the FA core complex may have a structural role in 
promoting substrate binding along with FANCE, whereas the 
other (FANCL_top) may be the active E3 that promotes ubiquitin transfer 
to FANCD2 (Fig. 3g) within the structural state we observe. The 
FANCD2-FANCI substrate is also an asymmetric dimer. Thus, each of 
the two FANCL RING finger domains could monoubiquitinate one of 
the substrate proteins. Nevertheless, this purified complex does not 
monoubiquitinate FANCI, so an additional activation step might be 
required. Substrate binding is not required to activate the E3 ligase 
activity (Extended Data Fig. 9c, d). 

The majority of FA-complex mutations detected in patients with FA 
result in protein truncation and are found in the structural periphery of 
the FA core complex (Extended Data Fig. 9e). Residual monoubiquitin 
ligase activity is still present after deletion of peripheral subunits in 
cells and in vitro (Extended Data Fig. 9f), indicating a partially functioning 
core complex. By contrast, deletion of FANCB, FANCL or FAAP100, 
which comprise the catalytic module and the structural scaffold for the 
complex, eliminates this residual activity. Patients with FA who carry 
mutations in FANCB or FANCL are severely afflicted, and we predict that 
FANCB missense mutations (L43S, P230S, L329P and L676P) disrupt 
stability of the catalytic module (indicated by asterisks in Extended 
Data Fig. 4a). Together, these data provide genetic and clinical support 
that mutations in the catalytic module result in disruption of the FA 
core complex structure. By contrast, mutations in the periphery do not 
prevent complex formation and may be better tolerated. 

In summary, our data provide a structural model for the FA core complex, 
enabling an interpretation of the molecular pathophysiology of FA. 
The reconstituted system we describe will also enable further mechanistic 
questions to be addressed, including precisely how this large 
complex functions as a DNA damage-inducible monoubiquitin ligase. 

Methods 


No statistical methods were used to predetermine sample size. The 
experiments were not randomized. The investigators were not blinded 
to allocation during experiments and outcome assessment. 


Cloning, expression and purification 
cDNAs encoding full-length G. gallus (Gg) FANCA, FANCB, FANCC, FANCE, 
FANCF, FANCG, FANCL and FAAP100 were synthesized (GeneArt). 
FANCC contained a C-terminal extension with a 3C protease site and 
double Strep II tag. For protein expression, all genes were cloned into 
the MultiBac expression system and constructs were generated as 
previously described. In brief, to make gene expression cassettes, 
FANCA and FANCG were subcloned via BamHI and XbaI into pACEBac1; 
FANCC, FANCE, FANCF and FAAP100 were subcloned via XhoI and KpnI 
into pIDS; and FANCB and FANCL were subcloned via BamHI and XbaI 
into pIDC. Gene cassettes were then sequentially subcloned as BstXI-I-CeuI 
or BstXI-PI-SceI fragments into I-CeuI or PI-SceI sites to generate 
pACEBac1-FANCA-FANCG, pIDS-FANCC-3C-2xStrepII-FANCE-FANCF, 
and pIDS-FAAP100-FANCB-FANCL. The spectinomycin antibiotic resistance 
cassette of pIDS-FAAP100-FANCB-FANCL was substituted with 
the kanamycin antibiotic resistance cassette of pIDK as a SnaBI-PI-SceI 
fragment to generate pIDK-FAAP100-FANCB-FANCL. 
pACEBac1-FANCA-FANCG, pIDK-FAAP100-FANCB-FANCL and 
pIDS-FANCC-3C-2xStrepII-FANCE-FANCF were fused using Cre recombinase 
(NEB) to generate a single vector (gentamicin, spectinomycin and kanamycin 
resistant) containing a single copy of each gene (FA core complex) 
used to make protein for the initial data collection. All constructs were 
confirmed by restriction digest analysis, PCR and sequencing. 

For subsequent protein preparations, A-G-B-L-100-C-E-F, A-G-B-L-100, 
G-B-L-100 and C-E-F complexes (letters indicate the FANC family members) 
were prepared using a modified BiGBac system as described previously. 
The individual genes were PCR amplified for cloning into pBIG 
vectors from pACEBac1, pIDC or pIDS vectors. Sequences encoding 
2xStrep II tag and 3C protease site were included on FANCC. If FANCC 
was not present, the tag sequence was added to FANCB. The combined 
vector carrying the FA core complex or subcomplex was transformed 
into EMBacY cells to generate a bacmid. Bacmid DNA was transfected 
into Sf9 cells and virus was passaged twice in the same cell line before 
large-scale infection in Sf9 cells. Infected cells were collected when 
cell growth arrested. Sf9 cells were obtained from Oxford Expression 
Technologies, catalogue no. 600100 (negative for mycoplasma, identity 
not independently authenticated by us). 

Cells were lysed by sonication in lysis buffer (100 mM HEPES pH 8.0, 
300 mM NaCl, 1 mM TCEP, 5% glycerol, EDTA-free protease inhibitor, 
5 mM benzamidine hydrochloride and 100 U ml⁻¹ benzonase). Clarified 
cell lysate was incubated with StrepTactin resin (GE Healthcare) for 1 h 
followed by a wash with lysis buffer. Proteins were eluted in elution buffer 
(100 mM HEPES pH 8.0, 300 mM NaCl, 1 mM TCEP, 5% glycerol and 8 mM 
desthiobiotin). Further purification was performed on a HiTrap Heparin 
HP affinity column (GE Healthcare) using a linear NaCl gradient from 
150 mM to 1 M in 50 mM HEPES pH 8.0, 1 mM TCEP, over 
22 column volumes. For FA core complex, this was followed by anion-exchange 
chromatography (MonoQ, GE Healthcare) in the same buffer 
using a linear NaCl gradient from 150 mM to 1 M over 
20 column volumes. The final buffer for purified FA core complex and 
subcomplexes was 50 mM HEPES pH 8.0, ~500 mM NaCl, 1 mM TCEP. 
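
As a worked illustration of the linear salt gradients described above, the NaCl concentration at any point in the elution can be obtained by linear interpolation. The sketch below is not part of the original protocol: the function name `nacl_mm_at` is hypothetical, and it assumes an ideal linear gradient with no system dead volume.

```python
def nacl_mm_at(cv_elapsed, start_mm=150.0, end_mm=1000.0, total_cv=22.0):
    """Return the NaCl concentration (mM) after `cv_elapsed` column volumes.

    Assumes an ideal linear gradient from start_mm to end_mm over total_cv
    column volumes, ignoring dead volume (an illustrative assumption).
    """
    if not 0.0 <= cv_elapsed <= total_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    fraction = cv_elapsed / total_cv
    return start_mm + fraction * (end_mm - start_mm)


# Heparin step: 150 mM to 1 M over 22 column volumes.
heparin_midpoint = nacl_mm_at(11.0)
# MonoQ step: same concentrations over 20 column volumes.
monoq_midpoint = nacl_mm_at(10.0, total_cv=20.0)
```

Under these assumptions both gradients pass through 575 mM NaCl at their midpoint; the fraction in which a given complex elutes would still have to be read from the chromatogram.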


Ubiquitination assay 

Ubiquitination assays were performed as described previously. In 
brief, a reaction volume of 20 µl contained 75 nM E1 (Boston Biochem), 
1 µM E2 (GgUBE2T), 0.25 µM E3 (FA core complex), 1 µM substrate 
(His-GgFANCI, GgFANCI58, His-GgFANCD2, GgFANCD2*"), 50 µM 
5′-flapped DNA and 20 µM haemagglutinin (HA)-ubiquitin (Boston 
Biochem). For ubiquitin discharge assays, concentrations of 125 nM 
E1 (Boston Biochem), 5 µM E2 (GgUBE2T), 1.5 µM E3 (FA core complex) 
and 50 mM free lysine (instead of substrate) were used. The 0.25 µM 
E3 enzyme concentration is based on the FANCL subunit, estimated by 
comparing the amount of FANCL in FA core complex and subcomplexes 
against purified FANCL of known concentration on SDS-PAGE. The 
reaction buffer was 50 mM HEPES pH 8.0, 64 mM NaCl, 4% glycerol, 
5 mM MgCl2, 2 mM ATP and 0.5 mM DTT. The reaction was incubated 
at 30 °C for 90 min, stopped by adding NuPAGE LDS sample buffer 
(Thermo Fisher) and run on an SDS polyacrylamide gel (3-8% NuPAGE 
Tris-acetate). Samples were analysed by Coomassie staining or by western 
blot using a HA antibody (Santa Cruz Biotechnology). All assays 
were performed independently three times. 


Electron microscopy 

Protein complexes were vitrified by applying 3-3.5 µl purified protein 
(~1 µM) to UltrAuFoil R1.2/1.3 grids (Quantifoil) with a thin continuous 
carbon support layer (for initial FA core complex dataset) or in unsupported 
ice (for final FA core complex dataset and subcomplexes) that 
had been made hydrophilic using an argon:oxygen plasma, blotting for 
2.5, 3 or 4.5 s at 4 °C with relative humidity of 100% and plunging into 
liquid ethane using a Vitrobot Mark IV (FEI). 

Cryo-EM data were collected on a FEI Titan Krios transmission electron 
microscope operated at 300 keV acceleration voltage using EPU 
automated data collection software. An initial dataset was collected 
on a Falcon II detector at 47,000× nominal magnification with a pixel 
size of 1.774 Å. The final data for FA core complex (Extended 
Data Table 1) were collected on a Falcon III detector in counting mode 
at 75,000× nominal magnification, and pixel size of 1.04 Å (MRC LMB) 
or 1.085 Å (eBIC). The subcomplexes G-B-L-100-C-E-F, A-G-B-L-100 and 
G-B-L-100 were imaged at 59,000× nominal magnification on a Falcon 
III detector in integrating mode. B-L-100 data were collected at 47,000× 
on a Falcon II detector in integrating mode. 


Cryo-EM image processing 

Image processing was performed in Relion (v.2 and v.3.0-beta), and 
Relion wrappers were used for external programs except for EMAN2. 
For all datasets, whole-frame alignment was performed using MotionCor2, 
and contrast transfer function parameters were estimated using 
gCTF. 3D maps were post-processed to automatically estimate and 
apply the B-factor and to determine the resolution by Fourier shell 
correlation (FSC) between two independent half datasets using the 0.143 
criterion. Local resolution was estimated using ResMap. 

Initial model 

The initial FA core complex dataset was processed in Relion v.2. Particles 
were picked manually from a few micrographs, and used for 2D class averaging 
with a box size of 390 pixels. The resulting 2D class averages were 
used to pick particles from all micrographs using template-matching 
in Relion's autopicker. After 2D classification of the auto-picked particles, 
selected classes were used to make an initial 3D model in EMAN2. This 
initial model was used for subsequent 3D classification and refinement 
in Relion. 


Refinement 
The map generated from the Falcon II dataset was used during the first 
round of 3D classification with particles from the Falcon III dataset of 
FA core complex. The datasets from MRC LMB and eBIC were initially 
processed separately, to generate separate 3D reconstructions. The 
pixel size of eBIC data was determined using the 2.35 Å spacing from 
the gold foil images collected under the same imaging conditions as the 
sample data. The eBIC micrographs were then rescaled to the pixel size 
of LMB data and CTF was re-estimated, and all datasets were merged. 
The FA core complex map was divided into three bodies: body 1 (middle 
region), body 2 (base region) and body 3 (top region). Bodies 2 and 3 
were rotated relative to body 1 because the body 1 (middle) region appears to be 
the most rigid (Extended Data Fig. 2f). Sigma angles of 10 and a sigma 
offset of 2 were used during refinement. Multibody refinement was 
continued from the last iteration of consensus refinement. The middle 
region was best resolved with an overall resolution of 4.4 Å whereas the 
top and base were ~7 Å resolution. To estimate the flexibility in the FA 
core complex we performed principal component analysis on the optimal 
orientations of all the bodies for all particle images in the dataset 
using relion_flex_analyse. We rendered videos for principal components 
1 and 2 as these described ~30% of the variance in the rotations 
and translations. 
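
The pixel-size calibration and rescaling step mentioned above can be sketched as simple arithmetic. This is an illustration only and not the processing code that was used: the function names, the `apparent_spacing_a` input and the 4,096-pixel micrograph edge are hypothetical, whereas the 2.35 Å gold spacing and the two pixel sizes are taken from the text.

```python
GOLD_SPACING_A = 2.35  # gold lattice spacing quoted in the text, in angstroms


def calibrated_pixel_size(nominal_pixel_a, apparent_spacing_a):
    """Rescale a nominal pixel size so the gold reflection lands at 2.35 A.

    `apparent_spacing_a` is the spacing measured from the power spectrum when
    the nominal pixel size is assumed (a hypothetical input for this sketch).
    """
    return nominal_pixel_a * GOLD_SPACING_A / apparent_spacing_a


def resampled_size(n_pixels, source_pixel_a, target_pixel_a):
    """Number of pixels spanning the same physical length after resampling."""
    return round(n_pixels * source_pixel_a / target_pixel_a)


# Pixel sizes reported for the two sites (angstroms per pixel).
EBIC_PIXEL_A = 1.085
LMB_PIXEL_A = 1.04

# A hypothetical 4,096-pixel micrograph edge resampled from eBIC to LMB sampling.
new_edge = resampled_size(4096, EBIC_PIXEL_A, LMB_PIXEL_A)
```

Because the rescaled micrographs have a different sampling, defocus and other CTF parameters no longer match the earlier estimates, which is consistent with the CTF being re-estimated after rescaling.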

Per-particle CTF refinement and beam tilt estimation were performed 
by dividing the datasets into individual data collection sessions and 
processing in a bigger box of 586 pixels, followed by further refinement. 
The resolution of the maps for the top and base regions of the 
complete complex was further improved by performing focused classification 
with signal subtraction followed by refinement (Extended Data 
Fig. 2g, h). The overall resolution is probably limited by heterogeneity. 
A regularization T-value of 5 was used during 3D refinement to boost 
the contribution of higher spatial frequencies, to improve the quality 
of the map to aid in model building. 

The subcomplexes were processed up to 2D classification as there 
were not enough different views to generate a 3D map. 


Local symmetry averaging 

Cryo-EM structure determination by single-particle analysis relies on the 
reduction of noise through averaging over multiple copies of extremely 
noisy projection images of individual macromolecular complexes. For 
symmetric complexes, for example, for homo-multimers or for icosahe- 
dral virus capsids, additional averaging can be performed by imposing 
point-group symmetry on the reconstruction. Because each projection 
image of a symmetrical object provides multiple views of the asymmetric 
unit, compared to asymmetric complexes, the same number of 
images will yield a better reconstruction, or fewer images are needed 
to obtain a reconstruction of the same quality. Therefore, point-group 
symmetry averaging is commonly employed in single-particle analysis 
refinement programs. 
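
The benefit of symmetry averaging can be illustrated with a toy simulation: averaging n independent noisy copies of the same signal reduces the noise standard deviation by roughly the square root of n. The sketch below is a generic statistical illustration under the assumption of independent Gaussian noise; it is not a model of the actual reconstruction algorithm, and the function name is hypothetical.

```python
import math
import random
import statistics


def noise_after_averaging(n_copies, n_samples=20000, seed=0):
    """Standard deviation of unit Gaussian noise after averaging n_copies."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    averaged = [
        sum(rng.gauss(0.0, 1.0) for _ in range(n_copies)) / n_copies
        for _ in range(n_samples)
    ]
    return statistics.pstdev(averaged)


single = noise_after_averaging(1)
twofold = noise_after_averaging(2)
expected_gain = math.sqrt(2)  # theoretical noise reduction for two copies
```

For two-fold averaging, the ratio `single / twofold` should come out close to `expected_gain`, which is the sense in which fewer images are needed for a symmetric object.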

The FA subcomplex described here does not obey overall point-group 
symmetry, but still contains multiple, potentially identical, subcom- 
plexes. We call this local symmetry. Averaging over locally symmetric 
subcomplexes in cryo-EM single-particle analysis has previously been 
performed to improve reconstructions after completion of the refinement 
process, for example on the subunits of triangulation number T>1 
virus capsids. However, local symmetry averaging has the potential 
to improve the reconstruction at every stage of the iterative refinement 
process. Better reconstructions during refinement will lead to better 
alignments, and hence a better final reconstruction. Therefore, we implemented 
a local symmetry averaging approach inside the relion_refine 
program. This approach has conceptual similarities to imposing non-crystallographic 
symmetry (NCS) in X-ray crystallography. 

This new implementation within Relion 3.0 allows the user to define an 
arbitrary number of groups, each with an arbitrary number of assumed 
identical subcomplexes. For each group, the user provides a mask 
around one member of the group, as well as 3D transformation matrices 
(expressed as three Euler angles and three translations in x, y and z) in 
real space to superimpose that member onto each of the other members 
in the group. This information is expressed in a Relion-style STAR file, 
which is passed using the local symmetry command-line option to the 
relion_refine program. A helper program to find and optimize the 3D 
transformation matrices, relion_localsym, was also implemented. To 
minimize artefacts in Fourier space, the edges of the masks should be 
kept soft, that is, they should gradually change from zero to one over 
multiple real-space pixels. 
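
A soft mask edge of the kind described above can be sketched as a smooth falloff from one to zero. The raised-cosine profile used here is one common choice and is an assumption of this sketch; the text does not specify the exact profile, and `soft_edge_value` is a hypothetical helper.

```python
import math


def soft_edge_value(distance_px, edge_width_px):
    """Mask value at `distance_px` outside the solid part of the mask.

    Returns 1 inside the mask, 0 beyond the soft edge, and a raised-cosine
    falloff in between (the profile itself is an illustrative assumption).
    """
    if distance_px <= 0.0:
        return 1.0
    if distance_px >= edge_width_px:
        return 0.0
    return 0.5 + 0.5 * math.cos(math.pi * distance_px / edge_width_px)


# Mask values across a 10-pixel soft edge, sampled once per pixel.
profile = [soft_edge_value(d, 10.0) for d in range(0, 11)]
```

The values decrease monotonically across the edge and pass through 0.5 at its midpoint, so there is no sharp step in real space.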

Our new implementation allows local symmetry averaging in both 
3D classifications and 3D refinements, and works with 2D projection 
images as well as 3D images, such as sub-tomograms. At every step of 
the expectation-maximization algorithm, local symmetry is applied 
according to the masks and transformations defined in the STAR file. 
This symmetrization is performed after the maximization step in real 
space. As such, the signal-to-noise gain that arises from the additional 
averaging is not considered when calculating the FSC between the two 
independent half-reconstructions in 3D auto-refinement, or when cal- 
culating the signal from the power of the reconstruction in 3D classifica- 
tion. Therefore, in cases of local symmetry, it may be advantageous to 
increase the empirical regularization T-value, --tau2_fudge, to account 
for the expected gain in signal during the refinement. When doing so, 
care should be taken not to overfit the data by using a T-value that is too 
high. Calculation of a final FSC curve using Relion’s post-processing 
program will lead to a realistic resolution estimate. However, care should 
be taken to use soft-edged masks for the local symmetrization, as the 
real-space mask operations may lead to artefacts in the Fourier-based 
resolution assessment. 

Another note of caution concerns the assumption that all the members 
of each of the local symmetry groups are identical. Since, by definition, 
the complex does not obey point group symmetry, this assumption 
can never be entirely true. At the very least, some of the subcomplex 
interfaces will be chemically different, whereas in worst-case scenarios 
biologically relevant conformational differences may exist within the 
members of each group. Imposing (local) symmetry on objects that are 
not identical will lead to a false impression of similarity in the output 
reconstruction. Besides minimizing artefacts in Fourier space, the soft 
edges of the local symmetry masks are also relevant here. Local symmetry 
is applied for all real-space pixels where the mask has values m > 0, 
but the symmetrized map will be calculated as (1 - m) times the original 
reconstruction plus m times the average of the corresponding voxel in 
all members of the group. Thereby, mask values of m < 1 can be used to 
impose local symmetry only partially. 
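
The blending rule stated above, (1 - m) times the original value plus m times the group average, can be written out for a handful of voxels. This is a minimal sketch of that formula only, not the relion_refine implementation: it assumes the group members have already been superimposed and resampled onto the same voxels, and the function name and toy values are hypothetical.

```python
def symmetrize_voxels(original, member_values, mask):
    """Blend each voxel with the group average, weighted by the mask value m.

    original      -- voxel values of the region covered by the mask
    member_values -- one list per group member, already resampled onto the
                     same voxels (the superposition step is assumed done)
    mask          -- soft mask values m in [0, 1], one per voxel
    """
    n_members = len(member_values)
    blended = []
    for i, (value, m) in enumerate(zip(original, mask)):
        average = sum(member[i] for member in member_values) / n_members
        blended.append((1.0 - m) * value + m * average)
    return blended


# Toy example with two group members and three voxels.
copy_a = [1.0, 2.0, 4.0]
copy_b = [3.0, 6.0, 0.0]
mask = [1.0, 0.5, 0.0]
result = symmetrize_voxels(copy_a, [copy_a, copy_b], mask)
```

In this toy case the first voxel (m = 1) is replaced by the average, the second (m = 0.5) moves halfway towards it, and the third (m = 0) is left unchanged, giving `[2.0, 3.0, 4.0]`.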

For the subcomplex reconstruction, masks around each of the symmetric 
regions were created and filtered to 25 Å with mask extension to 
6 pixels and a soft-edge of 10 pixels (Extended Data Fig. 7e). The trans- 
formation operator for the three Euler angles and the three translations 
that relate one symmetric region with the other were calculated using 
relion_localsym_mpi in real space, followed by 3D refinement using a 
regularization T-value of 100. This improved the overall appearance of 
the map (Extended Data Fig. 7f). 


Model building 
Model building was performed using Coot. All models described 
were built as polyalanine chains except FANCL (see below). 

A homology model for chicken FANCL (residues 1-373) was generated 
with I-TASSER using Drosophila FANCL (PDB 3K1L) as a template. The 
FANCL homology model was rigidly fitted into the EM map in Chimera 
using the ‘fit in map’ tool and then flexibly fitted using Jiggle fit in Coot. 
Densities corresponding to the ELF, URD and RING domains of FANCL 
were identified in the focused map of the base region; however, we were 
not able to orient the RING domain unambiguously in the density. The 
density for the ELF domain was well-defined in the focused map for 
the top region. There was only weak density for the URD domain and 
no density for the RING domain in FANCL_top (Extended Data Figs. 2g, 
5d). The ELF domain is next to the coiled-coil helix of FANCB, in agreement 
with FANCL crosslinks to the region C-terminal of the coiled-coil 
helix of FANCB (residues 439, 441, 454 and 460) and to the FAAP100 
β-propeller (residues 25, 180, 188, 262, 267 and 274). There are several 
crosslinks between FANCL URD-RING and the substrate-recognition 
module (FANCC, FANCE and FANCF) consistent with its placement within 
the base of the complex. 

The ‘multi-body refinement map’ for the middle region, the map from 
T=5 regularization, and the map for symmetric subcomplex were used 
to build the central region of FANCB-FAAP100 de novo by placing ide- 
alized helices and strands in Coot, and refining their fits with the real 
space refine zone tool. In agreement with their role as a scaffold, FANCB 
and FAAP100 have 64 intermolecular crosslinks and they crosslink to 
four other subunits. 

One of the two β-propellers in the top region was built de novo using 
the ‘focused map’ for the top region and this model was used to identify 
a structural homologue in DALI. The β-propeller of Bardet-Biedl syndrome 
1 protein (PDB 4V0N) was the top hit, and this was also a good 
fit for the other β-propeller. The sequence of this model was changed 
to polyalanine and used to build all four β-propellers after rigid fitting 
into the ‘focused map’ for the top or base region. 

On the basis of the crosslinking patterns and secondary structure 
predictions, we assigned one of each of the pairs of β-propellers, and 
one helix of each of the coiled coils to FANCB (N-terminal region for 
β-propeller and residues ~390-429 for the helix) and the other to 
FAAP100 (N-terminal region for β-propeller and residues ~461-501 
for the helix). These assignments agree with the hydrogen-deuterium 
exchange experiments of B-100 versus B-L-100. The sequences of the 
predicted long helices of FANCB and FAAP100 were used in MARCOIL 
for assignment of heptad repeats (Extended Data Fig. 5e). Then, a 
FANCB-FAAP100 coiled-coil model was built in CCBuilder 2.0 using 
the advanced mode (Extended Data Fig. 5f). This model was then fitted 
into the map and refined in Coot using real space refinement. 

A homology model for chicken FANCG (residues 1-648) was generated 
in I-TASSER using TTC7B-hyccin complex (PDB 5DSE) and O-linked GlcNAc 
transferase (PDB 1W3B) as the top two templates. The TPR region of 
the model (residues 153-432) was fitted into the focused map of the top 
region by rigid fitting in Chimera followed by Jiggle fit in Coot and refine- 
ment in Refmac. Additional helices were added towards the C-terminal 
region of FANCG using Coot. 

A homology model for chicken FANCF (residues 117-343) was built in 
I-TASSER using the crystal structure of human FANCF C-terminal region 
(PDB 2IQC) as one of the templates. Residues 117-191 and 215-343 were 
rigidly fitted into the focused map for the base region followed by Jiggle 
fit and real space refinement in Coot. In addition to the FANCF model, 
several short, idealized helices were also placed into the base. A crystal 
structure for human FANCE C-terminal region (PDB 2ILR) was available 
but we could not fit it in the map, presumably due to conformational variation. 
The C terminus of FANCC is near FANCE (residues 155-190; Fig. 1d). 

The above models were assembled together into models for top 
(FANCG, β-propellers and long helices of FANCB and FAAP100), middle 
and base (FANCL, FANCF, β-propellers and long helices of FANCB, 
FAAP100, unassigned helices) regions. These models were then further 
refined in Refmac and Phenix iteratively. 

The models for FANCB_top, FAAP100_top, FANCB-FAAP100, FANCG and 
FANCL_top built in the complete FA core complex map were rigidly fitted 
into the subcomplex map and refined in Refmac. All models and maps 
were visualized and rendered in UCSF Chimera or ChimeraX. 


Native mass spectrometry 

Native mass spectrometry experiments were carried out on a Q-Exactive 
Plus UHMR modified to facilitate the transmission of high-energy species 
and adapted for membrane proteins. The FA core complex at a 
concentration of 9.5 µM was buffer exchanged using Bio-Spin 6 columns 
(BioRad). Typically, 2-3 µl of sample in 750 mM ammonium acetate 
were injected, using a 1.2 mm outer diameter, gold-coated borosilicate 
capillary (Harvard Apparatus). The following parameters were used 
for protein transmission: capillary voltage 1.2 kV, desolvation voltage 
-300 V, source fragmentation 0 V, HCD energy 0 V, HCD pressure 
4, EMR on, C-trap entrance lens tune offset 2, injection flatapole 8 V, 
inter flatapole lens 6 V, and bent flatapole 4 V. Threshold was set to 3. 
Data were analysed using Xcalibur 2.2 (Thermo Fisher), MassLynx 4.2 
(Waters) and SUMMIT. The formation of some of the subcomplexes 
observed may be the result of buffer exchange from 50 mM HEPES pH 
8.0, ~500 mM NaCl and 1 mM TCEP into 750 mM ammonium acetate, 
which is often used in native mass spectrometry to improve resolution 
and generate subcomplexes to aid structure determination. 


HDX-MS 

Deuterium exchange reactions were initiated by diluting the complexes 
in D2O (99.8% D2O ACROS, Sigma) to give a final D2O percentage of ~95%. 
Deuterium labelling was generally carried out at 23 °C at four time points 
(3 s, 30 s, 300 s and 3,000 s) in triplicate. The labelling reaction was 
quenched by adding chilled 2.4% v/v formic acid in 2 M guanidinium 
hydrochloride and immediately frozen in liquid nitrogen. Samples were 
stored at -80 °C before analysis. 

The quenched samples were rapidly thawed and subjected to proteolytic
cleavage using pepsin followed by reverse-phase high performance
liquid chromatography separation. The proteins flowed through an
Enzymate BEH immobilized pepsin column, 2.1 × 30 mm, 5 µm (Waters)
at 200 µl min⁻¹ for 2 min, and the resulting peptides were trapped and
desalted on a 2.1 × 5 mm C18 trap column (Acquity BEH C18 VanGuard
pre-column, 1.7 µm, Waters). Trapped peptides were eluted over
11 min using a 3-43% gradient of acetonitrile in 0.1% v/v formic acid at
40 µl min⁻¹ onto a reverse phase analytical column (Acquity UPLC BEH
C18 column 1.7 µm, 100 mm × 1 mm (Waters)). The liquid chromatography
eluate was coupled to a SYNAPT G2-Si HDMS mass spectrometer
(Waters) and data were acquired over an m/z range of 300 to 2,000, using the
standard electrospray ionization (ESI) source with lock mass calibration
using [Glu1]-fibrinopeptide B (50 fmol µl⁻¹). The mass spectrometer was
operated in ion mobility mode, at a source temperature of 80 °C and a
spray voltage of 2.6 kV. Spectra were collected in positive-ion mode.

Peptide identification was performed with a non-deuterated sample
using MSE (Waters) to fragment peptides. An identical gradient of
increasing acetonitrile in 0.1% v/v formic acid over 11 min was used and
the resulting MSE data were analysed using ProteinLynx Global Server
software (Waters) with an MS tolerance of 5 ppm.

Mass analysis of the peptide centroids was performed using DynamX 
software (Waters). Only peptides with a score >6.4 were considered. All
peptides (deuterated and non-deuterated) were manually verified at 
every time point for the correct charge state, presence of overlapping 
peptides and correct retention time. Deuterium incorporation was 
not corrected for back-exchange and represents relative, rather than 
absolute changes in deuterium levels. Changes in H/D amide exchange 
in any peptide may be due to a single amide or several amides within 
that peptide. 
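
The relative comparison described above can be sketched as follows. This is a minimal illustration, not the DynamX workflow: the function name, the data layout (a dictionary of replicate centroid masses per labelling time) and the numerical values are assumptions introduced here. Uptake is taken as the centroid mass of the deuterated peptide minus that of the non-deuterated peptide, with no back-exchange correction, and the difference between two states is reported per time point.

```python
from statistics import mean

def uptake_difference(state_a, state_b, undeuterated_centroid):
    """Mean relative deuterium uptake of state A minus state B, per time point.

    state_a, state_b: dict mapping labelling time (s) to a list of replicate
    centroid masses (Da) for one peptide. No back-exchange correction is
    applied, so the values are relative rather than absolute.
    """
    difference = {}
    for time_s in sorted(state_a):
        uptake_a = mean(c - undeuterated_centroid for c in state_a[time_s])
        uptake_b = mean(c - undeuterated_centroid for c in state_b[time_s])
        difference[time_s] = uptake_a - uptake_b
    return difference

# Hypothetical centroids for one peptide in a larger complex (A) and a
# smaller complex (B), measured in triplicate at two labelling times.
larger = {3: [1000.5, 1000.5, 1000.5], 30: [1001.0, 1001.0, 1001.0]}
smaller = {3: [1001.0, 1001.0, 1001.0], 30: [1001.0, 1001.0, 1001.0]}
diff = uptake_difference(larger, smaller, undeuterated_centroid=1000.0)
```

A negative difference indicates that the peptide is protected in state A relative to state B; a positive difference indicates that it is more exposed.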


Crosslinking mass spectrometry of purified FA core complex 
The purified FA core complex (50 mM HEPES pH 8.0, ~500 mM NaCl and
1 mM TCEP) at a concentration of 7.6 µM was crosslinked with a 100-fold
molar ratio of disulfosuccinimidyl suberate (BS3) for 2 h on ice and the
reaction was quenched with 50 mM NH4HCO3 for 30 min at room
temperature. The crosslinked samples were cold-acetone precipitated and
resuspended in 8 M urea and 100 mM NH4HCO3. Peptides were reduced
with 10 mM DTT and alkylated with 50 mM iodoacetamide. Following
alkylation, proteins were digested with Lys-C (Pierce) at an enzyme-to-substrate
ratio of 1:100 for 4 h at 22 °C and, after diluting the urea to
1.5 M with 100 mM NH4HCO3 solution, further digested with trypsin
(Pierce) at an enzyme-to-substrate ratio of 1:20.
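
The urea dilution step follows the usual C1·V1 = C2·V2 relation. The sketch below shows the arithmetic only; the function name and the 30 µl starting volume are assumptions for illustration and are not taken from the protocol.

```python
def diluent_volume(initial_conc, final_conc, initial_volume):
    """Volume of diluent needed to lower a concentration from initial_conc
    to final_conc, starting from initial_volume (same units in, same out)."""
    if not 0 < final_conc <= initial_conc:
        raise ValueError("final_conc must be positive and not exceed initial_conc")
    # C1 * V1 = C2 * (V1 + V_diluent)  =>  V_diluent = V1 * (C1 / C2 - 1)
    return initial_volume * (initial_conc / final_conc - 1)

# Hypothetical example: 30 µl of sample in 8 M urea diluted to 1.5 M urea.
buffer_to_add_ul = diluent_volume(8.0, 1.5, 30.0)
```

For this assumed starting volume, about 130 µl of 100 mM NH4HCO3 would be added, a dilution factor of roughly 5.3.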

Digested peptides were eluted from StageTips and split into two, for
parallel crosslink enrichment by strong cation-exchange chromatography
(SCX) and size exclusion chromatography (SEC), and were dried
in a vacuum concentrator (Eppendorf). For SCX, eluted peptides were
dissolved in mobile phase A (30% acetonitrile (v/v), 10 mM KH2PO4,
pH 3) before strong cation exchange chromatography (100 × 2.1 mm
PolySULFOETHYL A column; PolyLC). The separation of the digest used a
nonlinear gradient into mobile phase B (30% acetonitrile (v/v), 10 mM
KH2PO4, pH 3, 1 M KCl) at a flow rate of 200 µl min⁻¹. Ten 1-min fractions
in the high-salt range were collected and cleaned by StageTips, eluted
and dried for subsequent liquid chromatography with tandem mass
spectrometry (LC-MS/MS) analysis. For peptide SEC, peptides were
fractionated on an ÄKTA Pure system (GE Healthcare) using a Superdex
Peptide 3.2/300 (GE Healthcare) at a flow rate of 10 µl min⁻¹ using 30%
(v/v) acetonitrile and 0.1% (v/v) trifluoroacetic acid as mobile phase.
Five 50-µl fractions were collected and dried for subsequent LC-MS/MS
analysis.

Samples for analysis were resuspended in 0.1% v/v formic acid, 1.6% v/v
acetonitrile. LC-MS/MS analysis was conducted in duplicate for SEC
fractions and triplicate for SCX fractions, performed on an Orbitrap Fusion
Lumos Tribrid mass spectrometer (Thermo Fisher Scientific) coupled
on-line with an Ultimate 3000 RSLCnano system (Dionex, Thermo Fisher
Scientific). The sample was separated and ionized by a 50 cm EASY-Spray
column (Thermo Fisher Scientific). Mobile phase A consisted of 0.1%
(v/v) formic acid and mobile phase B of 80% v/v acetonitrile with 0.1% v/v
formic acid. Peptides were separated at a flow rate of 0.3 µl min⁻¹ using
gradients optimized for each chromatographic fraction from offline
fractionation, ranging from 2% mobile phase B to 45% mobile phase B
over 90 min, followed by a linear
increase to 55% and 95% mobile phase B in 2.5 min, respectively. The MS
data were acquired in data-dependent mode using the top-speed setting
with a three second cycle time. For every cycle, the full scan mass
spectrum was recorded in the Orbitrap at a resolution of 120,000 in the range
of 400 to 1,600 m/z. Ions with a precursor charge state between 3+ and
7+ were isolated and fragmented. Fragmentation by higher-energy
collisional dissociation (HCD) employed a decision tree logic with optimized
collision energies. The fragmentation spectra were then recorded in
the Orbitrap with a resolution of 50,000. Dynamic exclusion was enabled
with single repeat count and 60-s exclusion duration.

A recalibration of the precursor m/z was conducted based on
high-confidence (<1% false discovery rate (FDR)) linear peptide identifications.
The recalibrated peak lists were searched against the sequences
and the reversed sequences (as decoys) of crosslinked peptides using
the Xi software suite (v.1.6.746)
(https://github.com/Rappsilber-Laboratory/XiSearch) for identification.
The following parameters were
applied for the search: MS1 accuracy = 3 ppm; MS2 accuracy = 10 ppm;
enzyme = trypsin (with full tryptic specificity) allowing up to four missed
cleavages; crosslinker = BS3 with an assumed reaction specificity for
lysine, serine, threonine, tyrosine and protein N termini; fixed
modifications = carbamidomethylation on cysteine; variable modifications =
oxidation on methionine, hydrolyzed/aminolyzed BS3 from reaction with
ammonia or water on a free crosslinker end. The identified candidates
were filtered to 1% FDR on link level using XiFDR v.1.1.26.58.
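
Two of the quantities used above, a mass accuracy in ppm and a target-decoy FDR cut-off, can be illustrated with a short sketch. This is a simplified single-score target-decoy estimate (decoys divided by targets above a score threshold) and is not the procedure implemented in XiFDR, which treats crosslinked matches differently. The function names, scores and data layout are assumptions made for illustration.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def filter_to_fdr(matches, max_fdr=0.01):
    """Keep target matches above the lowest score at which the estimated
    FDR (decoys / targets, counted from the top of the ranked list) is
    still within max_fdr.

    matches: list of (score, is_decoy) tuples.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of ranked entries retained
    for index, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = index
    return [m for m in ranked[:keep] if not m[1]]

# Hypothetical ranked list: 100 high-scoring targets, then a mix of
# decoys and targets at lower scores.
example = [(200 - i, False) for i in range(100)]
example += [(100.5, True), (100, False), (99, True), (98, False)]
accepted = filter_to_fdr(example, max_fdr=0.01)
```

In this example 101 target matches are retained: the second decoy pushes the estimated FDR above 1%, and the remaining lower-scoring target is not enough to bring it back within the limit.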


Pull-down assay

The purified subcomplexes A-G-B-L-100 and C-E-F (Strep II tag cleaved
by 3C protease) were mixed in a 1:1 molar ratio at concentrations of
1.4 µM each, for 1 h at 4 °C. A 20-µl reaction was incubated with 15 µl of
StrepTactin beads (GE Healthcare) equilibrated in 50 mM HEPES pH 8.0,
300 mM NaCl and 1 mM TCEP. The flow through was collected (unbound
fraction) and the beads were washed three times with equilibration 
buffer. The unbound and bound fractions were analysed on SDS-PAGE. 


Reporting summary 
Further information on research design is available in the Nature 
Research Reporting Summary linked to this paper. 


Data availability 


Cryo-EM maps generated during this study have been deposited in 
the Electron Microscopy Data Bank with accession codes EMD-10290 
(FA core complex consensus), EMD-10291 (focused classification top 
region), EMD-10292 (focused classification middle region), EMD-10293 
(focused classification base region) and EMD-10294 (subcomplex). 
Models generated during this study have been deposited in the Pro- 
tein Data Bank (PDB) with accession codes 6SRI (FA core complex) and 
6SRS (subcomplex). Native mass spectrometry data are available from 
figshare at https://doi.org/10.6084/m9.figshare.9692192. Crosslinking 
mass spectrometry data have been deposited in the PRIDE database
with accession code PXD014282. All other data are available from the
authors upon reasonable request. 

Extended Data Fig. 1 | Recombinant FA core complex activity.

a, Ubiquitination assay analysed by western blot with HA antibody to detect
HA-tagged ubiquitin. The migration positions of monoubiquitinated FANCD2
and FANCI are indicated but FANCI is not substantially modified. b,
Ubiquitination assay analysed by Coomassie-stained SDS-PAGE to show
specific monoubiquitination of FANCD2 K563 by recombinant FA core
complex. Wild type (WT), FANCD2(K563R) and FANCI(K525R) (KR) were
analysed. A native FA core complex purified from chicken DT40 cells
monoubiquitinates FANCD2 but does not efficiently monoubiquitinate FANCI.
Therefore, the purified recombinant complex faithfully recapitulates the
properties of the native chicken complex. Notably, a purified human complex
also did not efficiently monoubiquitinate FANCI, although it did efficiently
monoubiquitinate FANCD2. The asymmetry in the FA core complex (see
below) reflects this asymmetry in its activity on FANCD2-FANCI. An additional 
factor or post-translational modification may be required for activation of 
FANCI monoubiquitination. The ubiquitination assays were repeated at least 
two times independently with similar results. For gel source data, see 
Supplementary Fig. 1. 

Extended Data Fig. 2 | Cryo-EM reconstruction of FA core complex, multi-body
refinement and assessment of 3D reconstructions after focused
refinement. a, Representative raw micrograph of FA core complex. b, Overall
3D reconstruction of the FA core complex. c, Angular distribution density plot of
particles used in the 3D reconstruction of the FA core complex. Every point is a
particle orientation and the colour scale represents the normalized density of
views around this point. The colour scale runs from 0 (low, blue) to 0.00026
(high, red). The efficiency of orientation distribution, E_od, was 0.79.
d, Estimated local resolution map for FA core complex. e, FSC plot for gold
standard refinement. f, Multibody refinement of the FA core complex using
three masks (body 1, body 2 and body 3) shown in pink, purple and cyan,
respectively. The motions are shown in Supplementary Videos 1 and 2. g, Local
resolution maps for reconstructions of top, middle and base regions of FA core
complex. The middle region did not substantially change between multi-body
refinement and particle subtraction followed by focused classification. The
resolution of the base and top regions improved after particle subtraction and
focused classification and refinement. h, FSC plot for gold standard
refinements. i, Representative density for β-strand and α-helical regions.
FANCB-FAAP100 is in the middle region which is better defined than more
peripheral regions, including FANCF.
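
As an illustration of how a resolution value is read from an FSC plot such as those in panels e and h, the sketch below finds the spatial frequency at which the curve first drops below a threshold and reports its reciprocal. The 0.143 threshold is the one listed in Extended Data Table 1; the function name, the linear interpolation between shells and the example curve are assumptions made here, not the authors' procedure.

```python
def resolution_at_threshold(frequencies, fsc, threshold=0.143):
    """Resolution (Å) at the first crossing of an FSC curve below threshold.

    frequencies: spatial frequencies in 1/Å, ascending.
    fsc: Fourier shell correlation values at those frequencies.
    Returns None if the curve never drops below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two neighbouring shells.
            fraction = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = frequencies[i - 1] + fraction * (frequencies[i] - frequencies[i - 1])
            return 1.0 / crossing
    return None

# Hypothetical three-shell curve that crosses 0.143 at 0.25 1/Å.
resolution = resolution_at_threshold([0.1, 0.2, 0.3], [1.0, 0.243, 0.043])
```

With this made-up curve the crossing lies halfway between the second and third shells, giving a resolution of about 4.0 Å.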

Extended Data Fig. 3 | Subunit assignment and arrangement in FA core
complex. a-c, Complexes lacking specific subunits were purified and analysed
by cryo-EM. a, Major 2D class averages identified for subcomplexes, compared
with those from FA core complex (A-G-B-L-100-C-E-F). Cartoons are shown to
depict the subunits visible in the class averages. The symmetric subcomplex
identified in the FA core complex preparation is indicated (sym). The
A-G-B-L-100 complex (lacking the substrate-recognition module) has similar
symmetric 2D classes. Native mass spectrometry revealed a non-uniform
subunit stoichiometry. Thus it is likely that the asymmetric assembly represents
the complete FA core complex, while the symmetric structure is a subcomplex
that co-purifies with the intact complex. The 2D class average of a complex
lacking FANCA (G-B-L-100-C-E-F) appeared similar to the complete FA core
complex with no obvious missing density. FANCG is probably the partially
disordered arm that extends from the central part of the complex since this was
missing when FANCG was not present in the complex. b, Coomassie-stained
SDS-PAGE analysis of purified subcomplexes. Asterisks indicate contaminant
proteins. FANCA did not co-purify with the A-B-L-100 complex but its migration
position is indicated on the gel. The purifications were repeated at least two
times independently with similar results. For gel source data, see
Supplementary Fig. 1. c, Cartoon of FA core complex with subunits labelled.
d, Native mass spectrum of recombinant FA core complex showing masses and
subunit composition of assigned peak series. We dissociated the FA core
complex into subcomplexes during ionization and these species were detected
by mass spectrometry. Computational analyses then revealed the proteins
present in each of the peaks. The standard deviation in fitting the identified
peaks to the charge series is given as the ± error in the measured mass for a given
single measurement. This is the error in the fit and not the error in the mass
measurement, which is probably an order of magnitude higher due to, for
instance, solvation or adduct effects, or heterogeneous post-translational
modifications. Hence, the error gives a rough measure of the accuracy of peak
assignment, which is impacted by the broadness, symmetry and signal-to-noise
ratio of each peak. e, Molecular masses of FA core complex subunits. The
expected and measured masses are given, along with phosphorylation sites
detected by mass spectrometry. Native mass spectrometry was repeated three
times with similar results.
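
The fit error described in panel d can be illustrated with a minimal sketch of charge-series fitting. It assumes positive-ion peaks that differ by one charge and carry only protons; the function names and the simulated 150 kDa species are assumptions for illustration and do not reproduce the software used in this study.

```python
from statistics import mean, stdev

PROTON_MASS = 1.007276  # Da

def mass_from_adjacent_peaks(mz_low, mz_high):
    """Charge and neutral mass from two adjacent charge-state peaks.

    mz_low carries charge z + 1 and mz_high carries charge z.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    return z, z * (mz_high - PROTON_MASS)

def fit_charge_series(mz_values, z_first):
    """Mean neutral mass and standard deviation across a charge series.

    mz_values: peak positions in descending m/z, so charges increase by
    one from z_first. The standard deviation is the error of the fit only.
    """
    masses = [(z_first + i) * (mz - PROTON_MASS) for i, mz in enumerate(mz_values)]
    return mean(masses), stdev(masses)

# Simulated peaks for a hypothetical 150,000 Da species at charges 30-32.
peaks = [150000.0 / z + PROTON_MASS for z in (30, 31, 32)]
charge, mass_estimate = mass_from_adjacent_peaks(peaks[1], peaks[0])
fitted_mass, fit_error = fit_charge_series(peaks, charge)
```

Because the simulated peaks are noise-free, the fit error here is essentially zero; with real peaks, broadness and adducts shift individual peak positions, and the spread of the per-peak masses gives the ± value reported for each series.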

Extended Data Fig. 4 | Crosslinking mass spectrometry. a, Crosslinking mass
spectrometry revealed 834 crosslinks (1% FDR) between residues that are in
close proximity. Intermolecular crosslinks are coloured in green; intramolecular
crosslinks, red; predicted α-helices, orange; predicted β-strands, purple.
Asterisks on FANCB mark missense mutations from the Fanconi Anaemia
Mutation Database (http://www2.rockefeller.edu/fanconi/) including L43S in a
predicted β-strand, P230S at the N terminus of a predicted helix, and L329P in
the middle of a predicted β-strand, all in the β-propeller; and L676P, which is
predicted to disrupt an α-helix in the C-terminal dimerization domain.
b-d, Validation of crosslinking of FA core complex. Crosslinks were mapped
onto a homology model of chicken FANCL (b), human FANCF (c) and human
FANCE (d). All intramolecular crosslinks within a domain are consistent with the
maximum crosslinker length (30 Å between the two Cα). Crosslinks between the
domains in FANCL are not consistent with the domain arrangement in the crystal
structure of FANCL because there is flexibility or changes in the orientation
between the domains in the FA core complex. There are two different
conformations of FANCL in the structure: FANCL_base is fully ordered, whereas
only the ELF domain is visible in FANCL_top. Mapping the crosslinks onto FANCL
from the base (constrained, with all three domains visible) reveals that the
distances for some crosslinks between domains are too large to be consistent
with the FANCL conformation in the base. By contrast, for FANCL in the top,
where only the ELF domain is ordered, the URD and RING domains are likely to be
conformationally flexible. Since the URD and RING domains cannot be modelled
for FANCL_top, it is not possible to validate these interdomain crosslinks.
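
The distance check used to validate crosslinks against a model can be sketched as below. The 30 Å Cα-Cα limit is taken from the legend above; the function name, the coordinate dictionary and the residue positions are hypothetical and serve only to show the comparison.

```python
import math

MAX_CA_CA_DISTANCE = 30.0  # Å, maximum Cα-Cα span quoted in the legend

def classify_crosslinks(ca_coords, crosslinks, max_distance=MAX_CA_CA_DISTANCE):
    """Split crosslinks into those consistent with a model and those that
    exceed the maximum Cα-Cα distance.

    ca_coords: dict mapping a residue identifier to its Cα (x, y, z) in Å.
    crosslinks: iterable of (residue_a, residue_b) pairs.
    """
    consistent, violated = [], []
    for res_a, res_b in crosslinks:
        distance = math.dist(ca_coords[res_a], ca_coords[res_b])
        bucket = consistent if distance <= max_distance else violated
        bucket.append((res_a, res_b, distance))
    return consistent, violated

# Hypothetical Cα positions for three lysines in one modelled subunit.
coords = {"K10": (0.0, 0.0, 0.0), "K50": (0.0, 0.0, 12.0), "K300": (40.0, 0.0, 0.0)}
ok, too_long = classify_crosslinks(coords, [("K10", "K50"), ("K10", "K300")])
```

In this made-up example the K10-K50 link (12 Å) is consistent with the model, whereas the K10-K300 link (40 Å) would be flagged, as for the interdomain FANCL crosslinks discussed above.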

Extended Data Fig. 5 | Assessment of model fit in maps and modelling of
coiled coils. a-c, FSC plots of maps versus model for top (a), middle (b) and
base (c) regions. d, Low-resolution cryo-EM map of FA core complex
(transparent surface) with models placed in the map. Asterisks represent
density that was not visible at high resolution and was not interpreted. This may
represent FANCA and additional parts of the substrate-recognition module. e,
MARCOIL prediction for the best heptad phase in the long helices of FANCB
and FAAP100. f, Predicted coiled-coil model by CCbuilder 2.0 for the FANCB
and FAAP100 long helices. Crosslinks detected for these helices in the FA core
complex crosslinking mass spectrometry are indicated with blue lines.

Extended Data Fig. 6 | HDX-MS on FANCB and FAAP100. a-d, Difference plots
for FANCB showing peptides that are protected (negative) or exposed
(positive) upon binding of additional subunit(s) for B-L-100 vs B-100 (a),
G-B-L-100 vs B-L-100 (b), A-G-B-L-100 vs G-B-L-100 (c) and FA core complex vs
G-B-L-100 (d). Exchange of hydrogens in FANCB residues 429-448 was
protected after interaction with FANCL, consistent with FANCL being located
next to the coiled coil. e-h, Difference plots for FAAP100 showing peptides that
are protected (negative) or exposed (positive) upon binding of additional
subunit(s) for B-L-100 vs B-100 (e), G-B-L-100 vs B-L-100 (f), A-G-B-L-100 vs
G-B-L-100 (g) and FA core complex vs G-B-L-100 (h). Exchange of hydrogens in
FAAP100 residues 448-464 was protected after interaction with FANCL,
consistent with FANCL being located next to the coiled coil. For difference
plots, triplicate data from four independent colour-coded time points are
shown. The significance threshold is indicated by dashed lines. Grey shading
indicates the standard deviation of all charge states and replicates per peptide.
Sequence coverage is shown in the Supplementary Information.
[Panel annotations, in order of appearance in the figure: GBL100 vs. BL100
(98 peptides, 69.9% coverage, 1.68 redundancy); FA core vs. GBL100
(125 peptides, 76.6% coverage, 1.85 redundancy); GBL100 vs. BL100
(91 peptides, 76.9% coverage, 1.66 redundancy); FA core vs. GBL100
(120 peptides, 78.3% coverage, 2.05 redundancy). Labelled peptide regions:
226-238, 264-285 and 448-463.]

Extended Data Fig. 7 | 3D reconstruction of symmetric FA subcomplex and
local symmetry refinement. a, Overall 3D reconstruction of the symmetric FA
subcomplex. b, Angular distribution plot of particles used in the 3D
reconstruction of the symmetric FA subcomplex. Every point is a particle
orientation and the colour scale represents the normalized density of views
around this point. The colour scale runs from 0 (low, blue) to 0.00026 (high,
red). The efficiency of orientation distribution, E_od, was 0.65. c, Estimated local
resolution map for symmetric FA subcomplex. d, FSC plot for gold standard
refinement. e, Local symmetry pipeline for reconstruction of the symmetric FA
subcomplex (see Methods). This reconstruction could not be improved with C2
symmetry, probably because of local flexibility. f, FSC plot for gold standard
refinement shown for the subcomplex reconstruction before and after local
symmetry refinement. The circular panels show representative densities before
and after local symmetry refinement.

Extended Data Fig. 8 | Model for FA subcomplex. a, Model of FA subcomplex
shown as cartoon representations of subunits fit into the cryo-EM map. b, Model
of FA subcomplex shown as a surface representation of the combined models.
Two views are shown down the two-fold symmetric axis. c, Comparison of FA
subcomplex and complete FA core complex in the same orientations. Both
models are shown in cartoon representation. Subunits are coloured as in Fig. 1e.
d, Modelling of a fully symmetric FA core complex containing two copies of
every subunit. Left, the second copy of FANCG (cartoon) from the FA
subcomplex was modelled onto the structure of the FA core complex (surface
representation). This second FANCG clashes with the FANCB β-propeller in the
base (asterisks). Thus, it is likely that upon binding of the substrate-recognition
module, rearrangement of the β-propellers of FANCB and FAAP100 prevents
binding of a second copy of FANCG. Right, a second copy of the
substrate-recognition module (FANCC-FANCE-FANCF; cartoon) is modelled in the top
region of the FA core complex by combining the models of
FANCC-FANCE-FANCF and FANCL_base from FA core complex followed by superimposing
FANCL_base on FANCL_top. There is a clash (asterisks) between the modelled FANCL
(cartoon) and FAAP100 β-propeller (surface representation). These data
suggest that a fully symmetric complex does not readily form. In agreement with
this, there was no evidence for any classes containing two copies of C-E-F in any
of our EM analyses of the FA core complex. e, f, The symmetric FA subcomplex
(A-G-B-L-100) does not readily associate with purified substrate-recognition
module (C-E-F) to form the asymmetric FA core complex. e, The 2D class
averages of A-G-B-L-100 mixed with C-E-F compared with complete FA core
complex, FA subcomplex and A-G-B-L-100 subcomplex. A-G-B-L-100 was mixed
with C-E-F in molar ratios of 1:1 and 1:2 for 1 h at 4 °C before cryo-plunging. Only
the symmetric A-G-B-L-100 subcomplex was observed and there was no
additional density for C-E-F. Panels for FA core complex, subcomplex and
A-G-B-L-100 are replicated from Extended Data Fig. 3a. f, Pull-down assay of C-E-F
using tagged A-G-B-L-100 (Strep II tagged). Left, the Coomassie Blue-stained gel
of purified C-E-F (with Strep II tag, after 3C cleavage of tag and after removal of
3C protease). Tagged A-G-B-L-100 was immobilized on StrepTactin resin and
incubated with purified C-E-F at a 1:1 molar ratio. After washing, only a small
amount of C-E-F remains bound to the beads. Negative controls (A-G-B-L-100
only and C-E-F only) are shown in the middle panel. Asterisk indicates a
contaminant protein. The pull-down experiment was repeated two times
independently with similar results. Since C-E-F does not efficiently bind
A-G-B-L-100, these experiments suggest that these species are unlikely to be in
equilibrium in solution. Previous genetic and biochemical data show that
FANCC, FANCE and FANCF are important for monoubiquitination of the
FANCD2-FANCI substrate. Together, these data provide evidence that the
asymmetric complex is the relevant, functional, physiological entity.

Extended Data Fig. 9 | Structural comparison of E3 ligases. a, There is a
strong precedent for dimerization of RING/U-box domain E3 ubiquitin
ligases. RING/U-box E3s exist both as homo- and heterodimeric complexes,
for example, Rad18-Rad18, CHIP-CHIP, RNF8-RNF8, BRCA1-BARD1,
RING1b-BMI1 and Hdm2-Hdmx. Structures of homo- and heterodimeric
RING/U-box E3 ligases are shown here with the RING/U-box in orange. Surprisingly,
these E3s display functional and structural asymmetry: in all the dimers listed
above, only one protomer binds to an E2 enzyme. The homodimeric CHIP E3
ligase has a strikingly asymmetric structure that clearly demonstrates why only
one U-box binds E2 enzyme. The FANCL RING subunit is also an asymmetric
dimer within the FA core complex and it is possible that only one of these binds
E2. However, unlike the smaller E3s, the FANCL RING fingers are not near each
other. Together, this suggests that asymmetric dimerization may be a general
feature of RING E3s. b, Comparison of FA core complex with cullin-RING
ubiquitin ligases (CRLs). Many large complexes are predominantly helical
suggesting that α-helices are commonly used as building blocks for complexes.
In addition, β-propellers often mediate protein-protein interactions. The CRL
complexes and FA core complex are long and extended with
substrate-recognition (green), scaffold (yellow) and RING (orange) subunits residing in
three different regions of the structure. However, the structural details of these
complexes differ. Interestingly, the activities of some multisubunit
RING-containing E3 ligases including APC/C and CRL complexes are stimulated by
dimerization. Thus, dimerization may underpin physiological
ubiquitination activity in many E3 ligases. c, d, Ubiquitin discharge assay, in
which free lysine is used instead of the FANCD2-FANCI substrate. In these
experiments, the FA core complex is incubated with E1, E2, ubiquitin and free
lysine. If FANCL is active without a substrate, ubiquitin will be conjugated to
lysine, resulting in a shift in its molecular weight; however, if substrate binding
is required to activate the E3 ligase activity (for example, through allosteric
changes), this will not occur. Coomassie gels of reaction products were run in
non-reducing (c) and reducing (100 mM DTT) conditions (d). Ubiquitin is
transferred to free lysine as shown by the increase in molecular weight of
ubiquitin as well as a decrease in intensity of the E2-ubiquitin band when
compared to the lane containing no free lysine. Thus, substrate binding is not
required for activity. Reducing conditions do not eliminate the
UBE2T-ubiquitin conjugate, as previously shown. Additionally, DNA is not required
for FA core complex E3 ligase activity on free lysines, suggesting that DNA
activates the substrate, not the E3. The ubiquitin discharge assays were
repeated three times independently with similar results (c, d).
e, Distributions of patient mutations are indicated on the FA core complex by
heat map colouring of subunits and in percentage. f, Ubiquitination assay using
several subcomplexes (Extended Data Fig. 3b) and the full FA core complex,
analysed by western blot with HA antibody to detect HA-tagged ubiquitin. The
migration positions of monoubiquitinated FANCD2 and FANCI are indicated but
FANCI is not substantially modified, as in Extended Data Fig. 1a. All complexes
have similar activities but isolated FANCL is less active. This assay was repeated
at least two times independently with similar results. For gel source data, see
Supplementary Fig. 1.


Extended Data Table 1| Cryo-EM data collection, refinement and validation statistics 


FA core complex
(EMD-10290: consensus,
EMD-10291: top,
EMD-10292: middle,
EMD-10293: base)
(PDB 6SRI)


Subcomplex
(EMD-10294)
(PDB 6SRS)


Data collection and processing 


Magnification

Voltage (keV)

Electron exposure (e⁻/Å²)
Defocus range (µm)

Pixel size (Å)


Symmetry imposed
Initial particle images (no.)
Final particle images (no.)
Map resolution (Å)


FSC threshold
Map resolution range (Å)


Refinement
Initial model used


Model resolution (Å)


FSC threshold
Model resolution range (Å)
Map sharpening B factor (Å²)


Model composition
Non-hydrogen atoms
Protein residues
Ligands

B factors (Å²)
Protein
Ligand

R.m.s. deviations
Bond lengths (Å)
Bond angles (°)

Validation
MolProbity score
Clashscore
Poor rotamers (%)

Ramachandran plot
Favored (%)
Allowed (%)
Disallowed (%)


75,000×
300 


1.040 (LMB) 
1.085 (eBIC) 


4.2 (consensus), 4.5 (top), 
4.4 (middle), 4.9 (base) 
0.143 

4.2 to >10


De novo modelling and 
homology modelling 
4.5 (top), 4.4 (middle), 
4.9 (base) 

0.143 

n/a 

-149 (consensus) 

-198 (top) 

-190 (middle) 

-177 (bottom) 


15,309 

3,827 

0 

not estimated 


0.23 
0.48 


75,000×
300 


1.040 (LMB) 
1.085 (eBIC) 


0.143 
4.6 to >10


De novo modelling and 
homology modelling 
4.6 


0.143 
n/a 
-213 


12,424 
3,106 
0 


not estimated 


0.22 
0.50 

Extended Data Table 2 | Features of individual FA core complex subunits and structural models

Sequence identity / similarity values are between Gallus gallus and Homo sapiens (%).

FANCA
Length (aa): 1,421
Sequence features: α-helical
Models generated in this study: N/A
Maps used for modelling: N/A
Sequence identity / similarity: 49.1 / 65.6

FANCB
Length (aa): 867
Sequence features: Possible β-propeller, plus α-helical
Models generated in this study: De novo modelling of SSEs*
Maps used for modelling: Focused map for top (EMD-10291), middle (EMD-10292), and base regions (EMD-10293), consensus map with T=5 (EMD-10290) and subcomplex map (EMD-10294)
Sequence identity / similarity: 44.1 / 63.2

FANCC
Length (aa): 559
Sequence features: α-helical
Models generated in this study: SSEs placed in maps, not assigned
Maps used for modelling: Focused map for base (EMD-10293)
Sequence identity / similarity: 49.0 / 65.1

FANCE
Length (aa): 520
Sequence features: α-helical. Crystal structure of human orthologue (C-terminal half; PDB 2ILR)
Models generated in this study: SSEs placed in maps, not assigned
Maps used for modelling: Focused map for base (EMD-10293)
Sequence identity / similarity: 40.9 / 54.1

FANCF
Length (aa): 350
Sequence features: α-helical. Crystal structure of human orthologue (C-terminal half; PDB 2IQC)
Models generated in this study: C-terminal region from homology model based on PDB 2IQC
Maps used for modelling: Focused map for base (EMD-10293)
Sequence identity / similarity: 38.5 / 51.6

FANCG
Length (aa): 648
Sequence features: α-helical (TPR)
Models generated in this study: TPR domain from homology model (I-TASSER)
Maps used for modelling: Focused map for top region (EMD-10291)
Sequence identity / similarity: 37.3 / 52.9

FANCL
Length (aa): 373
Sequence features: ELF, URD and RING domains. Crystal structures of human (central domain, PDB 3ZQS; RING domain, PDB 4CCG) and Drosophila orthologues (full-length, PDB 3K1L)
Models generated in this study: ELF domain in FANCL_top. ELF, URD and RING domains in FANCL_base. Homology models based on PDB 3K1L
Maps used for modelling: Focused map for top (EMD-10291) and base (EMD-10293)
Sequence identity / similarity: 69.9 / 82.7

FAAP100
Length (aa): 888
Sequence features: Possible β-propeller, plus α-helical
Models generated in this study: De novo modelling of SSEs
Maps used for modelling: Focused map for top (EMD-10291), middle (EMD-10292), and base regions (EMD-10293), consensus map with T=5 (EMD-10290) and subcomplex map (EMD-10294)
Sequence identity / similarity: 55.6 / 68.7

*SSE, secondary structure elements

Statistics


For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section. 


n/a | Confirmed 


The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement 


A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly 


The statistical test(s) used AND whether they are one- or two-sided 
Only common tests should be described solely by name; describe more complex techniques in the Methods section. 


A description of all covariates tested 


A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons 


A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient)
AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) 


For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted 
Give P values as exact values whenever suitable.


For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings 


For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes 


Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated 


Our web collection on statistics for biologists contains articles on many of the points above. 


Software and code 


Policy information about availability of computer code 


Data collection: Cryo-EM data were collected on a Titan Krios microscope with EPU (FEI/Thermo Fisher Scientific)


Data analysis Relion v2, Relion v3.0-beta, Eman2, MotionCor2, gCTF, ResMap, Coot, I-TASSER, Refmac, Marcoil, CCbuilder2.0, Xcalibur 2.2 (Thermo 
Fisher), Masslynx 4.2 (Waters), SUMMIT, Protein Lynx Global Server software (Waters), DynamX software (Waters), Xi software suite
(version 1.6.746) and XiFDR version 1.1.26.58, UCSF Chimera, ChimeraX, Phenix


For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. 
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information. 


Data 


Policy information about availability of data 
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: 


- Accession codes, unique identifiers, or web links for publicly available datasets 
- A list of figures that have associated raw data 
- A description of any restrictions on data availability 


CryoEM maps generated during this study have been deposited in the Electron Microscopy Data Bank (EMDB) with accession codes EMD-10290 (FA core complex 
consensus), EMD-10291 (focused classification top region), EMD-10292 (focused classification middle region), EMD-10293 (focused classification base region) and 
EMD-10294 (subcomplex). Models generated during this study have been deposited in the Protein Data Bank (PDB) with accession codes 6SRI (FA core complex) and
6SRS (subcomplex). Native MS data is available from figshare with accession code: 10.6084/m9.figshare.9692192. Crosslinking MS data has been deposited in the 
PRIDE database with accession code PXD014282. All other data are available from the authors upon reasonable request. 
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Field-specific reporting 


Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection. 


[x] Life sciences   [ ] Behavioural & social sciences   [ ] Ecological, evolutionary & environmental sciences


For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf 


Life sciences study design 


All studies must disclose on these points even when the disclosure is negative. 


Sample size Sample sizes were chosen based on previous experience and published studies to evaluate reproducibility of assays. For cryo-EM, the initial 
number of particles was ~1,950,000, which was sufficient to obtain the stated resolution after 3D classification.


Data exclusions No data were excluded.


Replication All experiments (purifications, ubiquitination assays, pulldowns, nativeMS) were performed at least two or three times (exact number of 
replicates given in text). All attempts to replicate results were successful. 


Randomization Randomization is not relevant to the experiments performed in this study.


Blinding Blinding is not relevant to the experiments performed in this study. 


Reporting for specific materials, systems and methods 


We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, 
system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response. 


Materials & experimental systems (n/a | Involved in the study)
Antibodies
Eukaryotic cell lines
Palaeontology
Animals and other organisms
Clinical data
Human research participants

Methods (n/a | Involved in the study)
ChIP-seq
Flow cytometry
MRI-based neuroimaging


Antibodies 


Antibodies used HA-probe (F-7) HRP monoclonal, Santa Cruz, Cat# sc-7392HRP, Lot# H3017, dilution 1:1000 


Validation In this study, the HA antibody was used in Western blots to probe HA-tagged Ubiquitin. Western blot analysis of HA-tagged 
fusion proteins showing N-terminal HA-tagged JNK2 and JNK1 and C-terminal HA-tagged Daxx was performed by the 
manufacturer. In addition, we could verify the Western blotting results by Coomassie blue staining. 


Eukaryotic cell lines 


Policy information about cell lines 


Cell line source(s) Sf9, Oxford Expression Technologies Ltd, Cat No. 600100 
Authentication Cell line was not authenticated. 
Mycoplasma contamination Cell line was negative for mycoplasma. 


Commonly misidentified lines No commonly misidentified cell lines were used. 
(See ICLAC register)


The Linnean Society of London first admitted women in 1905.


CAREERS AND CONTROVERSY 
BEFORE THE FIRST WORLD WAR 


For decades after Nature’s launch in 1869, women’s contributions to science 
were played down by both the journal and wider society. By Claire Jones 


In its 150 years of existence, Nature has witnessed
the emergence of science as a profession.
But as research moved from a domestic
to an institutional setting, women became
increasingly invisible, and the historical narrative
became resolutely male.

I aim to redress the balance by identifying
the barriers that women faced and how they 
worked around them, gaining access to 
scientific education and chipping away at 
societies, journals and universities. Gradu- 
ally, they widened the corridors of power for 
those who followed. 

My focus is narrow (the United Kingdom
in the late nineteenth and early twentieth centuries),
but this was Nature’s heartland in its
first 50 years. And, for better or for worse, the
British Empire provided a backdrop for scientific
research in that era.

Wherever we look, women have been mostly 
absent from the story of science. To retrace 
the steps of these workaday women — not all 
heroines — of science is to understand how far 
we have travelled towards equity in the scien- 
tific workforce. 

You could be forgiven for thinking that
there was no such thing as a career in science
for women before the mid-twentieth century. 
Our popular understanding of science as an 
essentially female-free zone for most of its 
existence is seldom challenged. 

Yet women adopted various scientific guises 
before Nature was founded, and even occasionally
appeared on its pages in its early years. This
is not to say that science was a female-friendly 
career; serious prejudice and discrimination 
severely limited women’s opportunities. How- 
ever, recognizing the women who contrib- 
uted to the enterprise despite these barriers 
debunks the myth that science was (and is) 
inherently male. 

Early in the nineteenth century, women used
spaces seen as more appropriately ‘feminine’ to
negotiate a way into science. Science writing,
especially for children or popular audiences, 
scientific illustration and translation were all 
comfortable niches in which women could 
participate without threatening male pre-em- 
inence or ideals of femininity. 

Michael Faraday famously credited British 
science writer Jane Marcet’s Conversations on
Chemistry (1805) for inspiring him to take up 
science. Marianne North was a noted botanical 
illustrator, scientist and discoverer of plants. 
Later, astronomer Agnes Clerke negotiated a 
successful career as a writer of popular books 
on astronomy in the 1880s and 1890s, winning
the Royal Institution’s Actonian Prize in 1893. 


Learned societies 


At the time of Nature’s launch, most learned 
societies were male-only. In 1991, science 
historian Londa Schiebinger at Stanford 
University in California noted that for 300 
years, the only permanent female presence 
at the Royal Society was a skeleton preserved 
in the anatomy cupboard¹. In common with
other elite scientific bodies, the society resisted
admitting women as fellows until 1945, 26 years
after the Sex Disqualification (Removal) Act 
1919 was passed. Among other things, the act 
decreed that “a person shall not be disquali- 
fied by sex or marriage ... for admission to any 
incorporated society (whether incorporated by 
Royal Charter or otherwise)”. 

Nature was quick to rebuke the French Academy
of Sciences² when it denied admission to
physicist and chemist Marie Curie in 1911, even
though she had won a Nobel prize eight years
previously. “It is incomprehensible ... on any
ethical principles of rightness and justice,”
Nature wrote, “that because Curie happens to
be a woman she should be denied the laurels
which her pre-eminent scientific achievement 
has earned for her.” 

Women fought back, too. Around 1900, 
there was a concerted effort by a group led by
evolutionary botanist Marian Farquharson
to gain admission to scientific societies. After
strong debate between the fellows, 11 women
were admitted to the Linnean Society in 1905.
The society got its own back on Farquharson, 
however, by rejecting her application. She had 
to wait until 1908, when objections had died 
down, to be elected. 

Acrimony was not unusual when the ques- 
tion of women’s admission to societies was 
raised. When the Royal Geographical Society 
considered the issue in the decades around 
1900, heated argument between fellows and 
members of the society’s council broke out in
the letters page of The Times. Exclusion from 
learned societies hindered women’s access to 
networks, libraries, grants and collaboration, 
and made the career landscape very different 
for women than for men. 

Why the raw antipathy to women? One reason
was that science itself often taught ideas (now
discredited) that there were innate differences
in intelligence between the sexes that would 
limit women’s suitability for science. Darwin 
argued that evolutionary competition led to 
the higher development of male brains and of 
female emotions. 

As a result, people saw the admission 
of women as threatening to dumb down 
proceedings and harm the status of elite
societies. Thomas Henry Huxley, a biologist
and anthropologist who earned the sobriquet 
‘Darwin’s bulldog’ for his advocacy of evolu- 
tion, worked to prevent women’s admission 
to the Geological Society and the Ethnologi- 
cal Society of London, explicitly to preserve 
society status and prestige³. Ideologically
informed theories of male and female brains
and resulting intellectual deficit are remarkably
persistent, as neuroscientist Gina Rippon
demonstrates in her 2019 book The Gendered 
Brain, which uses science to demolish these 
ideas. Rippon criticizes, in particular, modern 
evolutionary psychology and brain studies 
that look for differences between the sexes 
and, when they find them, consider only biological
explanations.

However, the impact of these views — on 
women who were (and have been) internal- 
izing them, and on the scientific community 
at large — cannot be ignored. Mathematician 
and astronomer Mary Somerville, widely cel- 
ebrated in her time, remarked in entries in Per- 
sonal Recollections, from Early Life to Old Age, 
of Mary Somerville, published posthumously in
1874, that she had “no originality ... that spark 
from heaven is not granted to the [female]”. A 
review⁴ of her book in Nature identifies Somerville’s
genius as “wholly exceptional”, because
“women are not by nature adapted for studies 
which involve the higher processes of induc- 
tion and analysis”. Despite her unique scientific 
bent, the review takes pains to point out that
Somerville was still “beautifully womanly”.
Somerville had not only translated Pierre-Simon
Laplace’s notoriously difficult Traité de
Mécanique Céleste (as Mechanism of the Heavens
in 1831), she had also extended it with
explanatory notes and her book was adopted
as the standard text for higher mathematics at
the University of Cambridge, UK. Indeed, the
term ‘scientist’ was coined for Somerville in the
1840s by Cambridge don William Whewell, as
an alternative to ‘natural philosopher’ or ‘man
of science’.

Elizabeth Brown was a founding member of the British Astronomical Association in 1890.

In the early 1900s, Marie Stopes received a grant from the Royal Society.

Newer learned societies were not so choosy. 
These sprang up in large numbers towards the 
end of the nineteenth century as science spe- 
cialized and associations emerged for amateur 
enthusiasts, teachers and women. Indeed, 
some women took key roles in these societies. 
For example, several were active in the British
Astronomical Association, participating in 
expeditions, serving on its council and edit- 
ing its journal. Elizabeth Brown was a found- 
ing member of the association: she headed the 
Solar Section of the Liverpool Astronomical 
Society, formed in 1881, which evolved into 
the British Astronomical Association in 1890.

Astronomy provided particular opportu- 
nity for women, arguably because practition- 
ers remained in the field when other sciences 
professionalized and moved from the home 
to institutional spaces that excluded women. 
Botany, too, with its history as a feminized 
pursuit from the eighteenth century, proved 
welcoming, as did palaeobotany, which was 
strongly female-oriented in the first decades 
of the twentieth century⁵. Female palaeobotanists
researching and publishing at this time
include Margaret Benson at Royal Holloway
College, University of London; Agnes Arber,
who graduated from Newnham College in Cambridge;
Henderina Scott, who researched and
collaborated in a domestic setting; and Marie
Stopes at the University of Manchester. 


Collaboration sans compensation 


Elite societies might have baulked at having 
female fellows, but women still managed to 
find a way in, and participated in research in 
other ways, too. Between 1880 and 1914, some 
60 women contributed to the Royal Society by 
authoring or co-authoring published papers 
or by demonstrating at the annual soirée, a 
highlight of the London social season that 
continues today⁶.

Some women, including palaeontologist 
Dorothea Bate and Stopes (who is best known 
for her later work on birth control and noto- 
rious for her later endorsement of eugenics), 
even received grants from the Royal Society to 
fund their research. Stopes’ scientific career 
saw her travel widely for research, accept gov- 
ernment commissions, publish nearly 40 sci- 
entific papers and produce important insights 
into coal-forest ecology. She earned doctorates 
from the University of Munich in Germany and 
from University College London, and became 
the first woman to join the science teaching 
staff at the University of Manchester. 

Our modern understanding of a salaried sci- 
ence professional did not become completely 
valid until the second decade of the twentieth 
century, although men (and some women) did 
assume such roles from the 1870s onwards, 
often on the back of emerging technologies 
and industries, such as electrical engineer- 
ing. Even when they had university training, 
women tended to secure low-status, routine 
roles such as research assistants and human
calculators at, for example, the Royal Observatory
in Greenwich in the 1890s and at Imperial
College London from its establishment in 1907.

However, it was far from unusual for 
women scientists to work alongside salaried 
men yet receive no remuneration for their 
labours. Bate, for instance, worked with the 
Natural History Museum in London from 
1898, but was never paid, nor was she
made a member of staff until 1948, when she
was in her late 60s. The idea of a middle-class 
woman receiving payment violated all ideals 
of respectable femininity. 

Earlier in the century, this concept also 
affected Eleanor Ormerod, who provided 
economic advice on agricultural problems and 
pests. It was easier for a middle-class woman 
of means to carry out research or to do so 
alongside teaching, one of the few respectable 
careers for women. However, working-class 
women could find a pathway into science from 
a business direction. Nautical-instrument 
maker, inventor and navigation writer Janet 
Taylor ran a nautical academy in the East End of
London in the 1860s and 1870s, with the Admiralty
as one of her clients.

Ormerod was a pioneering technological 
scientist who was instrumental in establishing 
the discipline of economic entomology in Brit- 
ain, in particular through her annual reports 
published from 1877 to 1901. Although Ormerod
was self-taught and possessed no formal qualifications
(something not unusual for women or
men at the time, given the amateur tradition in
science), she advised and lectured on training
at various colleges and was an examiner at the 
University of Edinburgh, UK. 

Ormerod also participated in international 
collaborative research, acted as an expert wit- 
ness in legal cases and was commissioned as 
a consultant entomologist to the Royal Agri- 
cultural Society in 1882. However, she was not 
paid, and received only occasional expenses, 
despite giving her expertise for free for the next 
ten years. 

One route into science for women at this time
was through collaboration with a husband or 
other male family member. Yet, even for the 
most egalitarian of scientific partnerships, 
it was the man who tended to get the kudos, 
with his female collaborator cast in the role of 
assistant. 

Many women accepted this. Two examples 
are astronomer Margaret Huggins and Scott, 
a pioneer slow-motion filmmaker, botanist 
and palaeobotanist. Both women were inde- 
pendent researchers, but bought into the era’s 
perceptions about wives being ‘helpmeets’ to 
their husbands. 

Yet Scott’s husband was a strong supporter 
of women scientists, unlike Huggins’s, who 
complained that illness had prevented him 
blocking the award of the Royal Society Hughes 
Medal for original research to electrical engineer
and physicist Hertha Ayrton in 1906. When
Ayrton died in 1923, an obituary in Nature
asserted that, instead of pursuing her own scientific
interests, she should have looked after
her husband, and “put him in carpet slippers 
when he came home”, so that he could have better
devoted his efforts to his scientific work⁷.
Ayrton might have succeeded as a scientist but,
according to her obituarist at least, she did not 
succeed as a wife. 

Some of the research for which Ayrton was 
honoured had been done in her husband’s laboratories
at the Central Institution in Kensington,
London. This included work on her book
The Electric Arc (1902), which became the go-to
resource on the subject and had been serialized
in Nature in 1899.

When her husband died, Ayrton lost access 
to this institutional space and so turned her 
living room into a laboratory. Her confinement
to the domestic sphere at a time when emphasis
was being placed on precise measurements
and instrumentation prompted questions over 
her research and the credibility of her science. 

Women had to tread particularly carefully 
when they entered the laboratory, which was 
seen as a space for masculine display. Women’s
presence there could prompt scepticism, if not 
outright hostility, especially when access was 
for research rather than educational purposes. 
This antagonism often led to the development 
of parallel facilities, such as the Balfour Biological
Laboratory for Women at the University of
Cambridge in 1884. 

As the new century approached, more 
women were accessing a university educa- 
tion in science, and the idea of a professional 
female researcher was no longer an oddity. The
University of London was a key player here,
opening up its degrees to women and men on
an equal basis (except for medicine) from 1878.

Botanical illustrations by Marianne North made their mark in the mid-nineteenth century.

Science was particularly strong at London’s 
Royal Holloway and Bedford women’s colleges. 
When Royal Holloway opened its doors in 1886,
it did so with well-equipped chemical and bio- 
logical laboratories. 

Women were allowed to graduate from Scottish
universities after the passing of a special act
in 1889 (apart from degrees in medicine, which
were not conferred on women until 1916). 

But the battle for women’s higher educa- 
tion was not wholly won. That year, physician 
William Withers Moore used an address to the 
British Medical Association to warn against uni- 
versity education for women owing to the “dan- 
gers” it posed to female reproductive health 
and mental well-being. 




Undaunted by his warnings, some women 
graduates began to take on research posts 
and embark on higher degrees in the United 
Kingdom, Germany and the United States. For 
example, mathematician and biostatistician 
Karl Pearson employed a number of women 
at Galton Laboratory, established in 1904 at 
University College London. Alice Lee, who 
had studied mathematics at Bedford College, 
went on to become a doctor of science under
his supervision. Women were not awarded
degrees at Cambridge until 1948 (27 years after 
Oxford began conferring them), but they did 
study natural sciences and made contributions 
to research. Between 1902 and 1910, female 
researchers at Newnham College were instrumental
in founding the science of genetics⁸,
working alongside biologist William Bateson. 

A more acceptable route into science was
teaching in one of the colleges or high schools 
for girls that were being established at the end 
of the century. Many of the female graduates 
found their scientific niche in teaching, includ- 
ing Cambridge mathematician Sara Burstall, 
who became head of Manchester High School 
for Girls in 1898. 

However, not everyone was pleased with 
this development. Chemist William Armstrong 
used his report for the 1904 Mosely Education 
Commission to emphasize the “mental disabil- 
ities” that evolution had bestowed on women 
and to issue dire warnings about the “ruinous”
effects of allowing them to “contaminate” boys 
by teaching them science. 

The important work of female scientists 
during the First World War — stepping up 
to run laboratories while men were away at 
the front — is only just now being given due 
credit⁹. Stopes was recruited to the war effort
by the UK government’s Industrial Research 
Department, where she collaborated on 
research into the constituents of coal. Hilda 
Phoebe Hudson, like other female mathema- 
ticians, joined the Air Ministry to research 
problems in aeronautical engineering. 

The popular history of women in science 
tends to celebrate romantic ‘heroines’ such 
as Ada Lovelace (who, later in her short life at 
least, used her mathematical prowess mostly 
to gamble) or two-time Nobel-prizewinning 
Curie, rather than the workaday women who 
made their way in science as best they could — 
often very successfully. 

Remembering the breadth of female 
participation will not only end science’s ‘disappearing
woman’ trick, it might also illuminate
the current gender imbalance by making the 
point that science is, and always has been, for 
women as much as for men.


Claire Jones is a historian of science and 
senior lecturer at the University of Liverpool, 
UK.

Crystallographer Kathleen Lonsdale was one of the first two women to be elected as a fellow of the UK Royal Society, in 1945.


THE WOMEN WHO CRACKED 
THE GLASS CEILING 


After the First World War, female scientists gained footholds in 
academia as well as industrial and government research, despite 
facing prejudice and many other barriers. By Sally Horrocks 


Scientific career opportunities saw a
boost during the First World War as
a result of the realignment of science
to the military. For the first time,
scientists worked on problems ranging
from aviation and submarine detection to
chemical warfare. After the war, this expansion
continued, particularly in industry. Biochemist
Kathleen Culhane Lathbury was one female
scientist who benefited from that. During the
1920s and early 1930s, she worked for British
Drug Houses, one of the leading pharmaceutical
firms in the United Kingdom, which I focus
on here. In her post, Lathbury oversaw insulin
manufacturing.

But because the drug maker’s dining room 


was male-only, she was excluded from the 
social interactions that happen when dining 
with colleagues. In notes for a talk that she gave
on women in the chemical industry, Lathbury
said that the male graduate “is usually given 
quite a dignified position from the beginning. 
The girl who worked side by side with him 
at the university is hard up and constantly 
humiliated ... Even if her work is intellectually
satisfying, she will be expected to attain results 
from the ground floor for which her male equivalent
is given the help of a little altitude.”

In my role as a science historian, since 2011,
I have been senior academic adviser to An Oral
History of British Science, a National Life Stories
project in collaboration with the British Library.
The project has collected memories of the lives
and careers of British scientists since the 1940s.
(Edited extracts are available at www.bl.uk/voices-of-science,
with full interviews accessible
at sounds.bl.uk/oral-history/science.)

In 1922, Lathbury graduated from Royal 
Holloway College in London with a chemistry
degree. She signed her job applications ‘K. 
Culhane’ to mask her gender, and worked for 
no pay at the Royal Institute of Chemistry, 
concluding that “for women in the chemi- 
cal industry, magnificent health and a thick 
skin are more important than a knowledge 
of chemistry”. 

As her story demonstrates, the inter-war 
period was one of increased employment
of women in science, but also of continued
exclusion and segregation. After the First 
World War, what had been wartime research 
organizations grew, while those established 
before 1914, including corporate laboratories 
that had existed since the early 1890s, con- 
solidated their positions, contributing to the 
growth of a new, technical middle class. But the
career patterns of female scientists differed 
greatly from those of their male counterparts, 
and the disparity has persisted, even during the
Second World War and the first few decades of 
the cold war. 

In the United Kingdom, which I focus on
here, women were also limited by an expectation
that they would resign from work once they
married. In some cases, including in the civil 
service, such resignation was a formal require- 
ment with limited exceptions, so many of the 
women who enjoyed lengthy careers at govern- 
ment research organizations remained single. 
Women in the civil service could be exempted
from this bar if their work was deemed to be of
sufficient national importance, but, in practice,
very few actually received exemptions. 

One example of the paucity of exceptions 
was aeronautical engineering researcher 
Frances Bradfield, who studied mathematics 
and physics at Newnham College, Cambridge (a 
women’s college established in 1871). She joined
the UK government’s Royal Aircraft Establishment
(RAE) in Farnborough in 1918, along with
fellow Newnham graduate Muriel Barker. 

Bradfield remained at the RAE until her 
retirement in 1955, taking charge of small wind 
tunnels, mentoring many of her younger male 
colleagues and gaining the respect of her peers. 
Barker married colleague Hermann Glauert in
1922, and left her post. 

Fellow Farnborough employee Beatrice Shil- 
ling, an expert on aero-engines, however, was 
one of the few who received an exemption when 
she married RAE mathematician George Naylor 
in 1938, leaving the RAE only when she retired 
in 1969. Shilling developed a device to counter
engine cut-out in early Spitfire and Hurricane 
planes during the Battle of Britain in 1940. 


Marriage and mobility 


In 1945, X-ray crystallographer Kathleen 
Lonsdale (née Yardley) and biochemist Marjory 
Stephenson became the first two women to be 
elected fellows of the Royal Society, the United 
Kingdom’s national academy of sciences. 
Stephenson, who was employed for much of 
her career by the Medical Research Council, had 
won her first university appointment in 1943. 

Physics Nobel laureate William Henry Bragg 
had supported Lonsdale in her career at Uni- 
versity College London and at the Royal Insti- 
tution in London. Lonsdale worked from home 
after starting a family in 1929, and her husband
assumed domestic responsibilities. A pacifist 
and penal reformer, Lonsdale served a month’s
sentence in London’s Holloway Prison during
the Second World War because, as a Quaker,
she refused to register for civil-defence duties.

Stephanie Shirley built computers at the Post Office Research Station in the 1950s.

Beryl Platt, by contrast, studied engineering 
at the University of Cambridge, UK, and joined
the Hawker Aircraft Company in 1943. Platt 
had switched from mathematics to mechan- 
ical engineering (as one of 5 female students 
alongside 250 male undergraduates) when 
she arrived at Girton College in Cambridge 
two years earlier, because the UK government 
offered a state bursary to encourage engineer- 
ing undergraduates as part of the war effort. 
After a brief post-war career in air safety for 
British European Airways, she ended her professional
career in engineering when she married
textiles manufacturer Stewart Platt in 1949.

Women who married fellow scientists, par- 
ticularly those who worked in universities, 
were sometimes able to continue their involve- 
ment in research. Organic chemist Gertrude 
Robinson, who earned a master’s degree in 
1908, worked at the University of Manchester
as a research assistant to Chaim Weizmann
(who became Israel’s first president in 1949), 
before marrying future Nobel laureate Robert 
Robinson in 1912. She collaborated with him 
on research in organic chemistry, publishing 
more than 30 papers. The couple spent a brief 
period at the University of Sydney in Australia,
one of the growing number of universities in
the English-speaking world that recruited UK 
academic researchers and staff. 

Such international mobility was a feature of 
professional scientific careers from the nine- 
teenth century onwards, but men were more 
likely to take advantage of it than were women. 
More than 16% of UK-born chemists who joined 
the Royal Institute of Chemistry between 1887 
and 1943 worked overseas at some point during
their careers. 


War work 


As the world pivoted towards the Second 
World War in 1939, the United Kingdom started 
to see scientists as a national asset, and the 
Ministry of Labour and National Service estab- 
lished procedures for recruiting and training 
scientists and engineers. Men who were qual- 
ified to embark on courses in the physical 
sciences or in engineering were exempt from 
the armed services while they completed their 
degrees. These were compressed from three 
to two years, even in Scotland, where honours 
degrees typically last for four years. But the 
ministry actively discouraged universities 
from increasing the proportion of female 
students in science and engineering, despite 
the nation’s demand for expertise. 


Both women and men were directed into war
work after completing their studies, however. 
Some were roped in even earlier. For example, 
microbiologist Nada Jennett (née Phillips) and 
fellow University of Bristol students spent one of 
their holidays working for pharmaceutical com- 
pany Glaxo on penicillin production problems. 

After the war, Jennett trained as a teacher 
and worked in laboratories at the university 
and in a hospital in Cardiff until her first child
was born. She taught science part-time before 
returning to microbiology, then developed a 
second career lecturing in garden design. 

For men, wartime work was often the 
foundation of long and successful careers, 
but for women it generally represented a 
short interlude before full-time domestic 
responsibilities, which might be followed by 
unpaid voluntary work or by part-time paid 
employment, but rarely a permanent post. 
Some employers who had been reluctant to 
hire women relented, among them Imperial 
Chemical Industries (ICI), then Britain’s largest 
chemical manufacturer. 

ICI advertisements specified a preference 
“for women chemists of British national- 
ity”, perhaps helping to explain why refugee 
women who were scientists were not always 
able to find relevant work, even if they had 
impressive qualifications. In March 1941, for 
example, the journal Chemistry and Industry 
carried this advert: “LADY CHEMIST. Ger- 
man Refugee, aged 37. PhD (Berlin), seeks a 
position. Some research experience in Rub- 
ber Chemistry and accustomed to conduct 
searches in libraries and translate from German
and French.”

Women who were married, had children and 
had left science to concentrate on domestic 
responsibilities but wanted to contribute to 
the war effort also found suitable work hard to
come by. Lathbury, for one, ended up working
in statistical quality control at the Royal Ordnance
Factory after a brief stint as a wages clerk.

“For many women, continuing to work after marriage was often the only practical option.”

In 1939, Joan Strothers and Sam Curran, then
physics PhD students at the Cavendish Laboratory
in Cambridge, were trying to develop a
proximity fuse, an explosives detonator that 
triggered only when near the target. They mar- 
ried a year later and moved to the Telecom- 
munications Research Establishment, where 
Curran worked on centimetric radar systems
for installation in aircraft, while Strothers was 
part of the countermeasures group. Here, 
she developed the idea that led to Operation 
Window — the scattering of strips of metallic 
foil from aircraft to deceive enemy radar, a 
technique that was successfully used on D-Day. 


Expanding opportunities 
Towards the end of the Second World War, 
workforce planners expected a contraction 
in military research, enabling UK industry to 
recruit researchers to help recover the econ- 
omy after the war. But this contraction proved 
short-lived. Defence research, including work 
on a British atomic-bomb project, rapidly 
expanded in the late 1940s and early 1950s, creating
many new jobs in research organizations.
A small but growing number of graduate-level
female scientists found employment in
defence-research establishments and, thanks
to the 1946 abolition of the marriage bar in the
civil service, could now continue their careers
after marriage.

Engineer Beryl Platt (left) with an associate on the occasion of his wedding.

However, without maternity-leave 
legislation or a provision for childcare, many 
married women could not continue to work. 
And, although some enjoyed long careers, few 
reached senior positions. 

An exception was the naval engineer 
Elizabeth Killick, whose career began in the
early 1950s. Killick, who died in July 2019 aged
94, became deputy chief scientific officer 
and head of the Weapons Department at the 
Admiralty Underwater Weapons Establishment.
In 1982, she also became the first woman
to be elected to what is now the Royal Academy 
of Engineering. 

Expanded UK government support for 
health, education, employment and social 
security after the Second World War also generated
new opportunities for scientists, including
posts in biological sciences, which tended to 
be popular with female researchers. Organizations
such as the UK Public Health Laboratory
Service and the advice services coordinated 
by what was then the Ministry of Agriculture, 
Fisheries and Food also employed women. 

Measures agreed in 1955 meant that from 
1960, women who worked for the state received
the same wages as men. For women, this made 
careers in government research and univer- 
sities more attractive than those in industry, 
in which differential pay rates and benefits 
remained the norm. 

But even after the lifting of the marriage 
bar, women who did secure permanent aca- 
demic posts often had to assume significant 
teaching and administrative burdens while 
their male colleagues were free to focus on 
research — work that brought greater prestige
and faster promotion. 

In 1947, for example, Florence R. Shaw was 
appointed to an assistant lectureship at Uni- 
versity College, Leicester (now the University 
of Leicester), and was promoted to lecturer 
in 1948. But she published little after being 
elected a fellow of the Royal Institute of Chemistry
in 1949 and, on her retirement in 1965, was
praised for her teaching contribution as “a 
loyal and steadfast colleague in the Chemistry
Department, to whom many of our graduates 
owe a great deal”. 

Female researchers who pursued scientific 
careers during the post-war period faced 
emotional and practical challenges in the 
predominantly male environments. Many 
experienced self-doubt and had to come up 
with strategies to improve their status without 
seeming to be openly confrontational. 

Stephanie Shirley, who arrived in the United 
Kingdom in 1939 as a refugee from Nazi 
Germany, worked at the Post Office Research 
Station in the 1950s, building computers from
scratch. She recalls, “If you’re the only one, if
you fail, you fail for all women, and they say,
‘Well, we tried one of those and she was awful.’
Whereas if you succeed, it’s also remembered,
but somehow the presumption is that, ‘We had
her and she was good; at least we’ll try another
one and see if it works again.’”

In the 1940s, Beatrice Shilling developed a device to stop aeroplane engines cutting out.

In the late 1960s, recognition of the barriers
to women’s access to scientific careers began
to grow. These obstacles came to be seen as 
problems that needed to be addressed, rather 
than as the inevitable consequences of wom- 
en’s prioritizing of family obligations over 
career aspirations. From the 1970s, many of 
these formal barriers were removed. Female 
scientists in the United Kingdom and elsewhere
benefited from legislative changes that pro- 
moted greater equality in employment and 
provided for maternity leave. 

The three key pieces of UK legislation were 
the Equal Pay Act (1970); the Sex Discrimination 
Act (1975), which outlawed discrimination in 
employment on the grounds of gender or marital
status; and the Employment Protection Act
(1975), which established the principle of paid 
maternity leave, although it did not initially 
cover all women. 

In the United States, Title IX of the Education
Amendments Act (1972) outlawed discrimi- 
nation on the basis of gender in education or 
activities receiving federal funding. But as 
Margaret Rossiter showed in the 2012 third vol- 
ume of her book Women Scientists in America, 
those researchers had to fight hard to ensure
it was implemented. 

At a global level, the United Nations decreed
1975 to be International Women’s Year, and 
the first UN Conference on Women was 
held that year in Mexico City. In 1979, the UN
Convention on the Elimination of All Forms of
Discrimination Against Women was adopted. 
The European Economic Community (from 
1993, the European Community; from 2009, 
the European Union) was also a powerful force 
for promoting equality legislation in its mem- 
ber states, including the extension of mater- 
nity leave to all working women in the United 
Kingdom in 1993 and the extension of paternity
leave in 2010. (Paid paternity leave was intro- 
duced in the United Kingdom in 2003.) 
Legislative change and international con- 
ventions did not mean, however, that the 
expectations of employers or female scientists 
themselves changed suddenly, or that discrim- 
ination disappeared overnight. 
Meteorologist Julia Slingo, whose first 
daughter was born in 1980, opted to leave 
her job at the UK Met Office rather than take
maternity leave. She returned to work in 1981
after being offered flexible working arrange- 
ments, an option she continued to take advan- 
tage of even after she accepted a new role in 
the United States in 1986. She later returned to 
full-time work and enjoyed a successful career
before retiring in 2016 as chief scientist of the 
Met Office, a year after she was elected a fellow 
of the Royal Society. 

Such flexible arrangements became more 
widely available from the 1990s. This was 
because greater diversity in the workforce 
came to be seen as an economic asset, making 
gender equality a matter of sound business 
practice rather than merely about the pursuit 
of social justice. 

This business-case approach has also 
prompted efforts in Europe and the United 
States to address other aspects of diversity, 
including factors such as ethnicity, disabil- 
ity, sexual orientation and socio-economic 
status. Such an approach tends to focus on 
providing equality of opportunity to existing 
educational and employment structures rather 
than — as feminist critics have been advocat- 
ing since at least the 1990s — on challenging 
the imbalances of power that form the basis 
of under-representation. 

British female scientists who started their 
careers in the years after the First World War 
were a small minority in a relatively new profession
that was concentrated in Europe and
North America and was only just beginning to 
emerge elsewhere. 

Their counterparts in the twenty-first cen- 
tury are members of a global community of 
nearly 8 million researchers. More than 40% of 
those are in Asia, although the proportion of 
female researchers worldwide is less than 30%. 
Whereas many of the formal barriers to wom- 
en’s participation in UK science that existed 
in 1919 disappeared in the twentieth century, 
many fields continue to be numerically and 
structurally male. In these areas, career pro- 
gress for women — as was the case a century 
ago — involves a challenging process of trying 
to work in male-oriented environments while 
seeking to maintain their own gender identities. 

Female scientists might no longer be forced 
to choose between career or marriage and 
family. But they continue to face many chal- 
lenges, with workplace cultures and reward 
structures still designed mainly to accommo- 
date male-oriented norms and career paths. 


Sally Horrocks is associate professor of 
contemporary British history at the University 
of Leicester, UK. She thanks members of the 
An Oral History of British Science team past 
and present, as well as Liz Bruton (Science 
Museum, London) and Graeme Gooday 
(University of Leeds, UK) for advice and 
encouragement. 

e-mail: smh4@leicester.ac.uk 
CONTAINERS
IN THE CLOUD 


Standardized platforms allow researchers to run each other’s 
software — no installation required. By Jeffrey M. Perkel 


Murphy’s law for the digital age:
anything that can go wrong, will go
wrong during a live demonstration.
For Ben Marwick, that happened in
front of a roomful of landscape-archaeology
students in Berlin. The topic:
computational reproducibility using Docker.
Docker is a software tool that generates 
‘containers’ — standardized computational 
environments that can be shared and reused. 
Containers ensure that computational analy- 
ses always run on the same underlying infra- 
structure, fostering reproducibility. Docker 
thereby insulates researchers from the chal- 
lenges of installing and updating research 
software. However, it can be difficult to use. 
Marwick, an archaeologist at the University
of Washington in Seattle, had become
proficient in migrating Docker configuration
files (‘Dockerfiles’) from one project to the
next, making minor tweaks and getting them
to work. Colleagues in Germany invited him
to teach their students how to follow suit. But
because every student had a slightly different
set of hardware and software installed, each
one required a customized configuration. The
demo “was a complete disaster”, Marwick says.

Today, a growing collection of services 
allows researchers to sidestep such confu- 
sion. Using these services — which include 
Binder, Code Ocean, Colaboratory, Gigantum 
and Nextjournal — researchers can run code 
in the cloud without needing to install more 
software. They can lock down their software 
configurations, migrate those environments 
from laptops to high-performance computing 
clusters and share them with colleagues.
Educators can create and share course materials
with students, and journals can improve the
reproducibility of results in published articles. 
It’s never been easier to understand, evaluate, 
adopt and adapt the computational methods 
on which modern science depends. 

William Coon, a sleep researcher at Harvard
Medical School in Boston, Massachusetts,
spent weeks writing and debugging an algorithm,
only to discover that a colleague’s containerized
code could have saved a lot of time.
“I could have just gotten up and running, using 
all of the debugging work that he had already 
done, at the click of a button,” he says. 

Scientific software often requires installing, 
navigating and troubleshooting a byzantine 
network of computational ‘dependencies’ 
— the code libraries and tools on which each
software module relies. Some have to be compiled
from source code or configured just so,
and an installation that should take a few minutes
can degenerate into a frustrating online
odyssey through websites such as Stack Over- 
flow and GitHub. “One of the hardest parts of 
reproducibility is getting your computer set 
up in exactly the same way as somebody else’s 
computer is set up. That is just ridiculously dif- 
ficult,” says Kirstie Whitaker, a neuroscientist 
at the Alan Turing Institute in London. 


Easier evaluation 


Docker reduces that to a single command. 
“Docker really provides reduced friction for 
that stage of the cycle of reproducing some- 
body else’s work, in which you have to build 
the software from source and combine it with 
other external libraries,” says Lorena Barba, a 
mechanical and aerospace engineer at George 
Washington University in Washington DC. “It 
facilitates that part, making it less error-prone, 
making it less onerous in researcher time.” 

Barba’s team does most of its work in Docker
containers. But that is a computationally savvy
research group; others might find the pro- 
cess daunting. A text-based ‘command-line’ 
application, Docker has dozens of options, 
and building a working Dockerfile can be an 
exercise in frustration. 
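
For readers who have not seen one, a Dockerfile is a short text recipe for building a container. The sketch below is not taken from any project mentioned here; it is a minimal illustration in Python that renders one plausible Dockerfile for a Python analysis. The function name, the file names and the choice of base image are all assumptions made for the example.

```python
from textwrap import dedent


def render_dockerfile(python_version: str, requirements_file: str, entry_script: str) -> str:
    """Return the text of a minimal Dockerfile for a Python analysis.

    All three arguments are illustrative placeholders, not values from the article.
    """
    return dedent(
        f"""\
        # Start from an official Python image pinned to a specific version.
        FROM python:{python_version}-slim
        WORKDIR /analysis
        # Install the pinned dependencies first so Docker can cache this layer.
        COPY {requirements_file} .
        RUN pip install --no-cache-dir -r {requirements_file}
        # Copy the rest of the project and set the default command.
        COPY . .
        CMD ["python", "{entry_script}"]
        """
    )


if __name__ == "__main__":
    # Print the recipe; saving it as a file named "Dockerfile" would let
    # `docker build` turn it into an image.
    print(render_dockerfile("3.11", "requirements.txt", "run_analysis.py"))
```

Pinning the base image and installing dependencies from a fixed list are the two choices that do most of the work for reproducibility in a recipe like this.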

That’s where the cloud-based services come 
in. Binder is an open-source project that allows 
users to test-drive computational notebooks 
— documents such as Jupyter or R Markdown 
notebooks, which blend code, figures and text. 
Colaboratory (free), Code Ocean, Gigantum 
and Nextjournal (the latter three have free and 
paid tiers) let users write code in the cloud as 
well and, in some cases, bundle it with the data
to be processed. These platforms also allow 
users to modify the code and apply it to other 
datasets, and provide version-control features 
for reviewing changes. 

Such tools make it easier for researchers 
to evaluate their colleagues’ work. “With 
Binder, you have taken that barrier [of soft- 
ware installation] away,” says Karthik Ram, a 
computational ecologist at the University of 
California, Berkeley. “If I can click that button,
be dropped into a notebook where everything
is installed, the environment is exactly the way
you intended it to be, then you’ve made my life
easier to go take a look and give you feedback.”

Identifying required dependencies, and 
where to find them, varies with the platform. 
On Code Ocean and Gigantum, it’s a point-and-click
operation, whereas Binder requires
a list of dependencies in a GitHub repository.
Whitaker’s advice: codify your computing
environment as early as possible in a project,
and stick with it. “If you try and do it at the end,
then you are basically doing archaeology on 
your code, and it’s really, really hard,” she says. 
Ram developed a tool called Holepunch for 
projects that use the statistical programming
language R. Holepunch distils the process of 
setting up Binder into four simple commands. 
(See examples of our code running on all five 
platforms at go.nature.com/2ps9sel.) 
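
For a Python project, one common form for that list of dependencies is a requirements.txt file of pinned versions, which Binder can read from the repository. The following is a minimal sketch, assuming the analysis already runs in the current Python environment; the function names are invented for illustration, and it records every installed package rather than only those the analysis imports.

```python
from importlib import metadata
from pathlib import Path
from typing import List


def pinned_requirements() -> List[str]:
    """List every installed distribution as a 'name==version' pin, sorted by name."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            pins.add(f"{name}=={dist.version}")
    # Sort case-insensitively, with the raw string as a tie-breaker.
    return sorted(pins, key=lambda pin: (pin.lower(), pin))


def write_requirements(path: str = "requirements.txt") -> Path:
    """Write the pins to a file, one per line, and return its path."""
    target = Path(path)
    target.write_text("".join(pin + "\n" for pin in pinned_requirements()), encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_requirements()}")
```

Running this at the start of a project and committing the file alongside the code follows Whitaker's advice to capture the environment early rather than reconstructing it afterwards.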

The easiest way to try Binder is at 
mybinder.org, a free, albeit computationally 
limited, website. Or, for greater power and 
security, researchers can build private ‘Bin- 
derHubs’ instead. The Alan Turing Institute has 
two, including one called Hub23 (a reference to
Hut 23 at the Second World War code-breaking 
facility at Bletchley Park, UK), that provides
greater computational resources and the
ability to work with data sets that cannot be 
publicly shared, Whitaker says. The Pangeo 
community, which promotes open, reproduc- 
ible and scalable geoscience, built a dedicated 
BinderHub so that researchers can explore 
climate-modelling and satellite data sets that 
can amount to tens of terabytes, says Joe Hamman,
a computational hydroclimatologist at
the National Center for Atmospheric Research 
in Boulder, Colorado. (Whitaker’s team has 
published a tutorial on building a BinderHub 
at go.nature.com/349jscv.) 
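
Sharing a repository through mybinder.org comes down to handing colleagues a link. The sketch below builds such a link for a public GitHub repository, assuming the URL pattern that mybinder.org uses for GitHub sources; the helper name is hypothetical, and the 'labpath' query parameter for opening a specific notebook is an assumption that should be checked against the current Binder documentation.

```python
from typing import Optional
from urllib.parse import quote


def binder_launch_url(owner: str, repo: str, ref: str = "HEAD",
                      notebook: Optional[str] = None) -> str:
    """Build a mybinder.org launch link for a public GitHub repository.

    `ref` is a branch, tag or commit; `notebook` is an optional path to open.
    """
    url = f"https://mybinder.org/v2/gh/{quote(owner)}/{quote(repo)}/{quote(ref, safe='')}"
    if notebook:
        # Assumed parameter name; percent-encode the whole path, slashes included.
        url += f"?labpath={quote(notebook, safe='')}"
    return url


if __name__ == "__main__":
    # 'binder-examples/requirements' is a public demonstration repository.
    print(binder_launch_url("binder-examples", "requirements"))
```

Pointing `ref` at a tag or commit rather than a moving branch means that everyone who follows the link gets the same version of the code.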


Languages and clouds 


Google’s Colaboratory is basically a cross 
between a Jupyter notebook and Google 
Docs, meaning users can share, comment on 
and jointly edit notebooks, which are stored 
on Google Drive. Users execute their code in 
the Google cloud — only the Python language 
is officially supported — on a standard central
processing unit (CPU), a graphics processing 
unit (GPU) or a tensor processing unit (TPU), 
a specialized chip optimized for Google’s TensorFlow
deep-learning software. “You can open
up your notebook or someone else’s notebook 
from GitHub, start playing around with it and 
then save your copy on Google Drive and work 
on it later,” says Jake VanderPlas, a member of
the Colaboratory team at Google in Seattle. 

Nextjournal supports notebooks written 
in Python, R, Julia, Bash and Clojure, with 
more languages in development. According 
to Martin Kavalar, chief executive of Nextjournal,
which is based in Berlin, the company has
registered nearly 3,000 users since it launched 
the platform on 8 May. 

Gigantum, a beta version of which launched
last year, features a browser-based client 
that users can install on their own system or 
remotely, for cloud-based coding and execu- 
tion in the Jupyter and RStudio coding envi- 
ronments. Coon, who uses Gigantum to run 
machine-learning algorithms in the Amazon
cloud, says the service makes it easy for collab- 
orators to hit the ground running. “[They] can 
read through my Gigantum notebooks and use 
this cloud-compute infrastructure to do the 
training and learning,” he explains. 

Then there’s Code Ocean, which supports 
both notebooks and conventional scripts in 
Python, R, Julia, Matlab and C, among other 
languages. Several journals now use Code 
Ocean for peer review and to promote com- 
putational reproducibility, including titles 
from Taylor & Francis, De Gruyter and SPIE. 
In 2018, Nature Biotechnology, Nature Machine
Intelligence and Nature Methods launched a
pilot programme to use Code Ocean for peer 
review; Nature, Nature Protocols and BMC 
Bioinformatics subsequently joined the trial. 
More than 95 papers have now been involved in
the trial, according to Erika Pastrana, editorial 
director of Nature Research’s applied-science 
and chemistry journals, and more than 20 of 
those have been published. 

Felicity Allen, a computer scientist at the 
Wellcome Sanger Institute in Hinxton, UK, 
co-authored one study in that trial, which 
analysed the types of mutation that can arise 
from CRISPR-based gene editing (F. Allen et al.
Nature Biotechnol. 37, 64–72; 2019). She estimates
that it took a week to get the Code Ocean
environment working. “The reviewers seemed 
to really like it,” Allen says. “And I think it was 
really nice that it made an example that some- 
one could just press ‘go’ on and it would run.” 

Although some worry about the long-term 
viability of commercial container-computing 
services, researchers do have options. Simon 
Adar, chief executive of Code Ocean, notes that 
Code Ocean ‘compute capsules’ are archived 
by the CLOCKSS project, which preserves dig- 
ital copies of online scientific literature. And 
Code Ocean, Gigantum and Nextjournal allow 
Dockerfiles to be exported for use on other 
platforms. All of which means that researchers 
can be confident that their code will remain 
usable, whichever platform they choose. 

Benjamin Haibe-Kains, a computational 
pharmacogenomics researcher at the Princess 
Margaret Cancer Centre in Toronto, Canada, 
adopted Code Ocean to respond quickly to 
critiques of an analysis he published in Nature 
(B. Haibe-Kains et al. Nature 504, 389-393; 
2013). For him, Code Ocean provides a way 
to ensure his code can be used and evaluated 
by his team, peer reviewers and the broader 
scientific community. “It’s not so much that 
an analysis must be correct or wrong,” he says. 
“Nothing is really fully correct in this world. 
However, if you’re very transparent about it, 
you can always communicate efficiently in the
face of criticism. You have nothing to hide; 
everything is there.” 


Jeffrey M. Perkel is technology editor at 
Nature. 
I’ve been a scientific glassblower for
33 years. For much of that, I’ve worked
at the University of Oxford, where I
design and create glass equipment that
scientists can use for their research. 

The piece I’m most proud of is a perfusion 
apparatus that is used to keep human 
organs functioning outside the body. But I
also make glassware that is used throughout 
the university, such as high-vacuum 
manifolds, which are a series of knobs that 
operate a vacuum; glass apparatus for 
distillation and sublimation experiments; 
vessels with water jackets used to heat 
and cool materials; and high-temperature 
furnace tubes made of quartz or ceramic. 

My workbench hosts an array of tools 
for working with glass, many of which were 
custom-made for specific jobs. Each tool 
reminds me of what I first used it for and 
makes me consider how I might use it again.

Most are made of carbon, and need to 
be highly polished before use because any 
irregularities will be transferred to the finish 
on the glass.

My workbench is also where I ponder the
design of new glassware. It’s quite easy to
sketch something on a piece of paper, but
reproducing that concept as a workable
piece of glass equipment is a much more
difficult endeavour. 

I find that a lot of my work relies on 
intuition: I instinctively know when the glass
is the right temperature, or at what speed it 
needs to rotate on the lathe. Usually, I can tell 
when it’s turning fast enough by the sound of 
the lathe. 

Glassblowing is a declining art — worldwide, 
there aren’t many schools that teach it any 
more. A lot of the work can be done by a
computer, and there are now alternative 
materials to non-magnetic glassware. 

However, I learn something new almost 
every week, and am inspired by knowing 
that a little piece of glassware that I’ve
made has contributed in some way to the 
bigger picture of science when a researcher 
achieves milestone results. 


Terri Adams is a professional glassblower 
at the University of Oxford, UK. Interview by 
Sarah Boon. 


